maxplanck-ie / snakepipes

Customizable workflows based on snakemake and python for the analysis of NGS data
http://snakepipes.readthedocs.io

snakePipes createEnvs - missing mamba not reported. mamba in base #896

Closed mirax87 closed 1 year ago

mirax87 commented 1 year ago

Hey there,

I am in the process of installing snakePipes and followed the instructions on RTD line by line (after several failed Quick&Dirty attempts, which might even have worked). After following each step exactly, snakePipes createEnvs failed for me: the script aborted at the following line, inside a cryptic try-except.

Eventually it turned out that my mamba (from the base environment) was not present in the snakePipes environment, so I went back to the base environment, hacked my way through the relevant paths and executed, from the base environment, the following command (currently running and looking good).

target="$HOME/snakePipes_config/envs/"
(base)$ <path to miniconda3>/envs/snakePipes/bin/snakePipes createEnvs --condaDir $target

Point 1: Is this a valid path for the installation of environments? I will probably find out as soon as I run createIndices and the other workflow modules.

Point 2: It would be great to have a more descriptive error message in the line of code mentioned above. In this installation attempt, I started from an entirely fresh conda and snakePipes installation.

Cheers, -Michael

adRn-s commented 1 year ago

I am unable to reproduce the issue, for me these steps are working.

conda install mamba -c conda-forge
mamba create -n snakePipes -c mpi-ie -c conda-forge -c bioconda snakePipes
conda activate snakePipes
snakePipes createEnvs

Granted, the createEnvs takes time... so we'll have to wait until tomorrow when I get back and have a look at the end result. I am currently running these inside a container (docker run -it continuumio/miniconda3 bash)

PS. You got this right: mamba should only be installed in the base environment.

mirax87 commented 1 year ago

Hey, thanks for the quick answer.

In fact, in my reported issue, createEnvs already broke on the very first environment, i.e. you would see the error response right away.

Also it might be more useful to have a more explicit error message in the code. At this point it returns "There was an error when creating the environments!" as shown in bin/snakePipes, line 381.
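
Until the message in the script itself changes, a small pre-flight check in the shell can give a clearer failure. This is only a sketch: require_tool is a hypothetical helper and not part of snakePipes, and it assumes (as discussed in this thread) that createEnvs needs mamba on the PATH of the active environment. CONDA_DEFAULT_ENV is the variable conda sets on activation.

```shell
# Hypothetical helper (not part of snakePipes): fail early, with an explicit
# message, when a required tool cannot be found on PATH.
require_tool() {
    if ! command -v "$1" >/dev/null 2>&1; then
        printf 'error: "%s" not found on PATH (active env: %s)\n' \
            "$1" "${CONDA_DEFAULT_ENV:-none}" >&2
        return 1
    fi
}

# Usage (assumption: run inside the activated snakePipes env):
#   require_tool mamba && snakePipes createEnvs
```

With this in place, a missing mamba is named explicitly together with the active environment, instead of surfacing only as the generic error from createEnvs.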


Some background discussion:

As far as I understand the code, if mamba is present in base only, snakePipes createEnvs must break after conda activate snakePipes. That is because snakePipes createEnvs calls mamba to install the packages from the given <env>.yaml. See here bin/snakePipes, line 357

[...]
conda activate snakePipes     <<< mamba present in base env, not in snakePipes env
snakePipes createEnvs            <<< breaks here

adRn-s commented 1 year ago

So the createEnvs I left running yesterday finished now. No errors whatsoever. Could you share a reproducible example?

It doesn't break. Check it out:

conda activate snakePipes
which mamba

You'll see the path to mamba on the base env. In my case, inside the fresh docker container, this is /opt/conda/condabin/mamba .

mirax87 commented 1 year ago

For sharing reproducible behavior, I probably need to carry over my machine. ^^

Let's look into conda's behavior first.


Currently, I believe that my conda setup behaves differently to the snakePipes-expected behavior. In the activated environment, yours displays base-packages, mine doesn't, as you can see below.

My current work-around is to start from the (base) environment, where mamba is available, and use /path/to/snakePipes createEnvs and /path/to/createIndices.

For running snakePipes, I have installed conda and mamba into my specified snakePipes environment (testing in progress).


Conda behavior

In my case the base env conda and mamba are not displayed within the snakePipes environment. In the following snippet, conda falls back to an unrelated conda installation.

(base) user:~$ which mamba ; which conda
$HOME/miniconda3/bin/mamba
$HOME/miniconda3/bin/conda
(base) user:~$ conda create -n snakePipes_dummy
Collecting package metadata (current_repodata.json): done
Solving environment: done
[…]
(base) user:~$ conda activate snakePipes_dummy
(snakePipes_dummy) user:~$  which mamba; which conda
/opt/conda/condabin/conda                       <<< unrelated conda, mamba missing.

adRn-s commented 1 year ago

It's odd that the machine has two different conda installations, the one at /opt/conda/condabin/conda came as a surprise to me... I'm inclined to say that you need to check .bashrc (if bash is your shell) and verify which conda installation is really being sourced/ initialized.

Btw, conda is an executable file, but it's usually overridden by a shell function during the initialization. It's this one:

conda () {
    \local cmd="${1-__missing__}"
    case "$cmd" in
        (activate | deactivate) __conda_activate "$@" ;;
        (install | update | upgrade | remove | uninstall) __conda_exe "$@" || \return
            __conda_reactivate ;;
        (*) __conda_exe "$@" ;;
    esac
}

This actually uses another function, __conda_exe ...

__conda_exe () {
    (
        "$CONDA_EXE" $_CE_M $_CE_CONDA "$@"
    )
}

...that goes to execute whatever filepath the env-var $CONDA_EXE is pointing to. This means, that you may also check this env var value as well as the outputs from running which...
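
These checks can be bundled into one small function. This is a sketch only: conda_diagnose is a hypothetical name, and the function just reports what the current shell sees without changing anything. CONDA_EXE and CONDA_PREFIX are the variables conda itself sets.

```shell
# Hypothetical helper: report which conda/mamba the current shell would use.
# The body runs in a subshell so no variables leak into the caller.
conda_diagnose() (
    printf 'CONDA_EXE=%s\n' "${CONDA_EXE:-<unset>}"
    printf 'CONDA_PREFIX=%s\n' "${CONDA_PREFIX:-<unset>}"
    for tool in conda mamba; do
        # command -v prints the bare name for a shell function
        # and the full path for an executable found on PATH.
        printf '%s -> %s\n' "$tool" \
            "$(command -v "$tool" 2>/dev/null || echo 'not found')"
    done
)
```

Running conda_diagnose once in (base) and once after conda activate shows whether CONDA_EXE and the resolved paths belong to the same installation in both cases.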

Having more than 1 conda installation, if the initialization is not careful, will be troublesome.

Also, installing mamba in any environment other than base is discouraged by the mamba developers. Check the warning here, it says "Installing mamba into any other environment than base is not supported.". We already went that way during a past release of snakePipes, and ended up having the snakePipes workflow environments created inside the snakePipes env (nested)... it was broken. I'm afraid you could end up in a similar situation with your current solution.

adRn-s commented 1 year ago

I believe all this would be fixed if mamba was present in the base env of the system conda. Can't you ask the admin of this machine/ server to do so?

mirax87 commented 1 year ago

Thanks for all the insights! I am not sure whether I would get that 'fundamental' mamba installation. As for $CONDA_EXE, it never points to the 'unrelated' conda bin. I would rather not start messing around with that 'unrelated' conda and instead stay within my $HOME installation.


Eventually, my installation worked by executing the following: 1) (snakePipes-env)$ conda install mamba 2) (snakePipes-env)$ ~snakePipes createEnvs

I agree that this goes against all recommendations, but this also seems like an isolated issue. So far this is working just fine.

If I ever try a clean installation again, I might install conda and mamba next to each other, instead of installing mamba through conda, trying to see, whether it works.


To conclude this thread:

I recommend updating the error message in the lines in question to be more explanatory in case mamba is not found. (See snakePipes#L375 )

mirax87 commented 1 year ago

and my path failed. :-)

As you said, the environments were nested within the snakePipes env and caused unspecific trouble.

adRn-s commented 1 year ago

I believe all this would be fixed if mamba was present in the base env of the system conda. Can't you ask the admin of this machine/ server to do so?

If this isn't an option, your second option would be clearing out that system conda... Remember:

Having more than 1 conda installation, if the initialization is not careful, will be troublesome.

Just clean everything conda-related (should be under the comments that clearly indicate what they are) in your ~/.bashrc, and simply run the initialization code for/ from your actual user installation.
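
To see what would be cleaned before editing anything, the managed section can be printed first. A minimal sketch, assuming the block is delimited by the marker comments that conda init writes ("# >>> conda initialize >>>" and "# <<< conda initialize <<<"); show_conda_init_block is a hypothetical helper name.

```shell
# Hypothetical helper: print the section(s) that `conda init` manages in a
# shell rc file, without modifying the file. Defaults to ~/.bashrc.
show_conda_init_block() {
    sed -n '/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</p' \
        "${1:-$HOME/.bashrc}"
}

# Usage:
#   show_conda_init_block             # inspects ~/.bashrc
#   show_conda_init_block ~/.zshrc    # or any other rc file
```

If more than one block is printed, or the paths inside it point somewhere other than the user installation, that is the part to remove before re-running the initialization.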

Then, something along the lines of...

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh -b -f -p /home/${USER}/my_anaconda
/home/${USER}/my_anaconda/bin/conda init bash
source ~/.bashrc

mirax87 commented 1 year ago

Assuming I screwed up with mamba/conda installations, I started from scratch and noticed a difference whether mamba is available or not.

I followed your instructions above, but I handle activation manually, either through conda activate or mamba activate. Here, the latter succeeds in finding the (base) mamba.

There is a comment when using mamba activate|deactivate. Do you have a recommendation whether or not to follow that?


(base)$ mamba activate snakePipes-2.7.2
Run 'mamba init' to be able to run mamba activate/deactivate
and start a new shell session. Or use conda to activate/deactivate.
(snakePipes-2.7.2)$ which mamba
(snakePipes-2.7.2)$ $HOME/.miniconda3/bin/mamba                   <<< mamba found 
(snakePipes-2.7.2)$ mamba deactivate
Run 'mamba init' to be able to run mamba activate/deactivate
and start a new shell session. Or use conda to activate/deactivate.
(base)$ conda activate snakePipes-2.7.2
(snakePipes-2.7.2)$ which mamba
(snakePipes-2.7.2)$                                              <<< mamba missing

katsikora commented 1 year ago

Hi guys,

I'm closing due to inactivity, feel free to reopen if needed.

Best wishes,

Katarzyna