harvardinformatics / degenotate

MIT License

ModuleNotFoundError: No module named 'degenotate_lib' #21

Closed milesroberts-123 closed 1 year ago

milesroberts-123 commented 1 year ago

Hello! Today, I tried using degenotate as a conda environment to see if it could be easily incorporated into a snakemake workflow.

These are the exact commands I used:

# create degenotate environment
conda create --channel bioconda --name degenotate degenotate=1.0.0

# activate environment
conda activate degenotate

# call help options
degenotate.py -h

The last line above throws the following error:

Traceback (most recent call last):
  File "/mnt/home/robe1195/anaconda3/envs/degenotate/bin/degenotate.py", line 12, in <module>
    import degenotate_lib.core as CORE
ModuleNotFoundError: No module named 'degenotate_lib'

Any suggestions on how to fix this? Is this a bug with the conda release? The degenotate.py script works just fine if I clone the github repo. Should I use the environment.yaml file in the github repo to build the conda environment instead?

Thanks for your help!

gwct commented 1 year ago

Hi! The conda installation should work. I think I've had problems when I explicitly specify the channel as bioconda (e.g. --channel bioconda or -c bioconda). Can you try without specifying the channel to see if that works? I will try as well.

gwct commented 1 year ago

Hmm, I wasn't able to replicate this, though I used mamba instead of conda since conda is so slow on our filesystem here. I will try it again with conda just to be sure, but it will take a while.

Just as another check, when you're in the environment where degenotate is installed, can you tell me if there is a folder called degenotate_lib in the path $CONDA_PREFIX/lib/python3.8/site-packages/?

You may need to replace python3.8 with the version installed in your environment (just run ls $CONDA_PREFIX/lib/ to see what it is).
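The same check can also be made from inside Python, which shows what the running interpreter (rather than the shell) is able to import. This is a minimal sketch using the standard library's importlib.util.find_spec; the helper name locate_module is made up for illustration and is not part of degenotate.

```python
import importlib.util


def locate_module(name):
    """Return where the running interpreter would import `name` from, or None.

    `locate_module` is a hypothetical helper, not part of degenotate.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        # Nothing importable under this name on the current sys.path.
        return None
    if spec.submodule_search_locations:
        # Packages (folders such as degenotate_lib) report their directory here.
        return list(spec.submodule_search_locations)[0]
    # Single-file modules report the file they would be loaded from.
    return spec.origin


if __name__ == "__main__":
    print("degenotate_lib:", locate_module("degenotate_lib"))
```

If this prints None while the folder exists under $CONDA_PREFIX, the interpreter being run is probably not the one that belongs to the environment.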

milesroberts-123 commented 1 year ago

Thank you so much for your quick help! So I removed my failed environment:

conda env remove --name degenotate

Then tried to rebuild it without specifying bioconda:

conda create --name degenotate degenotate=1.0.0

But got the same error after running degenotate.py -h:

Traceback (most recent call last):
  File "/mnt/home/robe1195/anaconda3/envs/degenotate/bin/degenotate.py", line 12, in <module>
    import degenotate_lib.core as CORE
ModuleNotFoundError: No module named 'degenotate_lib'

When I looked for the degenotate_lib folder, I found it in two different places: $CONDA_PREFIX/lib/python3.1/site-packages and $CONDA_PREFIX/lib/python3.10/site-packages. Maybe this is the source of the problem?
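Seeing the folder under two paths does not necessarily mean there are two installs, since one directory may be a symlink to the other. A quick way to check is sketched below; python_lib_dirs is a hypothetical helper name, and it only assumes the lib/python3* layout described in this thread.

```python
import os


def python_lib_dirs(prefix):
    """Map each lib/python3* directory under `prefix` to its resolved real path."""
    lib = os.path.join(prefix, "lib")
    found = {}
    if not os.path.isdir(lib):
        return found
    for entry in sorted(os.listdir(lib)):
        full = os.path.join(lib, entry)
        if entry.startswith("python3") and os.path.isdir(full):
            # realpath follows symlinks, so aliases collapse to a single target.
            found[entry] = os.path.realpath(full)
    return found


if __name__ == "__main__":
    prefix = os.environ.get("CONDA_PREFIX")
    if prefix:
        for name, target in python_lib_dirs(prefix).items():
            print(name, "->", target)
```

If both entries print the same target, they are the same directory on disk and the duplicate is unlikely to be the cause of the import error.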

I removed the failed environment again and tried rebuilding it once more using mamba instead of conda:

# remove failed environment
conda env remove --name degenotate

# rebuild with mamba, without specifying channel
mamba create --name degenotate degenotate=1.0.0

# activate environment
conda activate degenotate

# test degenotate
degenotate.py -h

But again got the ModuleNotFoundError: No module named 'degenotate_lib' error with the degenotate_lib folder in two different places...

gwct commented 1 year ago

Hmm, ok. I also didn't encounter this problem when I created the environment with conda, so your original command should work (though I recommend leaving the --channel option off anyways, since it might not install the latest build with it for some reason). It is possible that it's getting confused about which site-packages folder to use, but I think it should still be able to use one of them. I'm actually not sure why there is a python3.1 in the environment though, since degenotate requires >= 3.10. Something is definitely getting mixed up though. A few more pieces of information that will be helpful:

import sys
sys.executable
sys.path
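For convenience, the interpreter details can be gathered in one go. The sketch below is only one way to do it; collect_diagnostics is a hypothetical name, and the two environment variables are included on the assumption that they are the ones most likely to redirect Python in a conda setup.

```python
import os
import sys


def collect_diagnostics():
    """Gather interpreter details useful for debugging an import error."""
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "sys_path": list(sys.path),
        # Environment variables that can change which packages Python sees.
        "CONDA_PREFIX": os.environ.get("CONDA_PREFIX"),
        "PYTHONPATH": os.environ.get("PYTHONPATH"),
    }


if __name__ == "__main__":
    for key, value in collect_diagnostics().items():
        print(f"{key}: {value}")
```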

Thanks!

milesroberts-123 commented 1 year ago

I'm not sure what happened, but I restarted my computer and my degenotate environment appears to work just fine. For completeness, here's what you asked for:

which python gives v3.10.8

In interactive mode, the python version is 3.10.8 again

The sys executable and sys path in interactive mode are:

>>> import sys
>>> sys.path
['', '/mnt/home/robe1195/anaconda3/envs/degenotate/lib/python310.zip', '/mnt/home/robe1195/anaconda3/envs/degenotate/lib/python3.10', '/mnt/home/robe1195/anaconda3/envs/degenotate/lib/python3.10/lib-dynload', '/mnt/home/robe1195/anaconda3/envs/degenotate/lib/python3.10/site-packages']
>>> sys.executable
'/mnt/home/robe1195/anaconda3/envs/degenotate/bin/python'

And the conda list output:

# packages in environment at /mnt/home/robe1195/anaconda3/envs/degenotate:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.18.1               h7f98852_0    conda-forge
ca-certificates           2022.12.7            ha878542_0    conda-forge
degenotate                1.0.0              pyhdfd78af_1    bioconda
keyutils                  1.6.1                h166bdaf_0    conda-forge
krb5                      1.20.1               hf9c8cef_0    conda-forge
ld_impl_linux-64          2.39                 hcc3a1bd_1    conda-forge
libcurl                   7.86.0               h6312ad2_2    conda-forge
libdeflate                1.13                 h166bdaf_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 12.2.0              h65d4601_19    conda-forge
libgomp                   12.2.0              h65d4601_19    conda-forge
libnghttp2                1.47.0               hdcd2b5c_1    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libsqlite                 3.40.0               h753d276_0    conda-forge
libssh2                   1.10.0               haa6b8db_3    conda-forge
libstdcxx-ng              12.2.0              h46fd767_19    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libzlib                   1.2.13               h166bdaf_4    conda-forge
ncurses                   6.3                  h27087fc_1    conda-forge
networkx                  2.8.8              pyhd8ed1ab_0    conda-forge
openssl                   1.1.1s               h0b41bf4_1    conda-forge
pip                       22.3.1             pyhd8ed1ab_0    conda-forge
pysam                     0.20.0                   pypi_0    pypi
python                    3.10.8          h257c98d_0_cpython    conda-forge
python_abi                3.10                    3_cp310    conda-forge
readline                  8.1.2                h0f457ee_0    conda-forge
setuptools                65.5.1             pyhd8ed1ab_0    conda-forge
tk                        8.6.12               h27826a3_0    conda-forge
tzdata                    2022g                h191b570_0    conda-forge
wheel                     0.38.4             pyhd8ed1ab_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
zlib                      1.2.13               h166bdaf_4    conda-forge

To ensure that things were indeed still working, I removed my old degenotate environment and installed a new one with conda create --name degenotate degenotate=1.0.0. I no longer got an error after running degenotate.py -h.

Hopefully that helps! I'm honestly not sure what changed... Thank you for your help!

milesroberts-123 commented 1 year ago

Oh wait, I think I figured out what happened! So, I was testing if degenotate could be incorporated into a snakemake workflow and I was using a snakemake environment module using python version 3.8. If I activate the module and then call degenotate, I get the error I had before:

# activate snakemake module
ml -* iccifort/2020.1.217 impi/2019.7.217 snakemake/5.26.1-Python-3.8.2

# activate degenotate
conda activate degenotate

# call degenotate
degenotate.py -h
Traceback (most recent call last):
  File "/mnt/home/robe1195/anaconda3/envs/degenotate/bin/degenotate.py", line 12, in <module>
    import degenotate_lib.core as CORE
ModuleNotFoundError: No module named 'degenotate_lib'

I didn't get this error after restarting my computer because the environment modules deactivate upon restart.

So it looks like in order to use both degenotate and snakemake at the same time I'll need to install a snakemake version that uses python 3.10?

gwct commented 1 year ago

Ah, yes, it looks like your snakemake module switches which python version you're using. So, not only is python 3.8 too early for degenotate, but it is also probably switching the system path within python so it can't see the site-packages folder from the environment. As you suggested, the way to fix this is to use a snakemake version that doesn't switch to a python version lower than 3.10. Fortunately, you should just be able to use conda/mamba to install the most recent version of snakemake right in the degenotate environment (https://anaconda.org/search?q=snakemake):

mamba install snakemake=7.19.1

Again, leave off the -c bioconda option, or else it may not be able to find some of the dependencies. You may also want to consider installing snakemake-minimal to reduce the number of dependencies required.
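To confirm whether a session has ended up on the wrong interpreter (for example after loading a module), the two conditions described above can be checked directly. This is a rough sketch: find_problems is a hypothetical helper, the 3.10 minimum is taken from this thread, and the path comparison is a plain prefix check that assumes POSIX-style paths.

```python
import os
import sys
from pathlib import PurePosixPath

MIN_VERSION = (3, 10)  # degenotate's stated minimum, per this thread


def find_problems(executable, version_info, conda_prefix):
    """Return reasons the interpreter may not see the environment's packages."""
    problems = []
    if tuple(version_info[:2]) < MIN_VERSION:
        problems.append(
            f"Python {version_info[0]}.{version_info[1]} is older than 3.10"
        )
    if not conda_prefix:
        problems.append("CONDA_PREFIX is not set")
    elif PurePosixPath(conda_prefix) not in PurePosixPath(executable).parents:
        # The interpreter lives somewhere other than the active environment.
        problems.append(f"{executable} is outside {conda_prefix}")
    return problems


if __name__ == "__main__":
    prefix = os.environ.get("CONDA_PREFIX")
    for problem in find_problems(sys.executable, sys.version_info, prefix):
        print("WARNING:", problem)
```

Running it with plain python, once before and once after loading the module, should show whether the module is what changes the interpreter.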

Let me know if this works!

milesroberts-123 commented 1 year ago

Hello! I installed the latest version of snakemake (v7.19.1) into one conda environment and installed degenotate into a separate conda environment. I then wrote a snakemake rule to call degenotate. Snakemake would correctly activate my degenotate environment before calling degenotate.py but I was getting the same module not found error I had been getting before.

It seems that snakemake has a default behavior which blocks conda environments from changing the $PYTHONPATH environment variable. In other words, snakemake was correctly activating my degenotate conda environment (allowing access to degenotate.py), but would not change the python path from the version of python installed in my base conda environment (python v3.8). I tried disabling this behavior with the --conda-not-block-search-path-envvars option of snakemake, but I would still get the same module not found error.

What finally worked was specifying the path to the correct python version using $CONDA_PREFIX within my snakemake rule:

$CONDA_PREFIX/bin/python $CONDA_PREFIX/bin/degenotate.py --overwrite -d " " -a {input.annot} -g {input.genome} -o {params.outputFolder}
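The same idea, sketched in Python for anyone calling degenotate from a script rather than a shell block. The function names are hypothetical; the flags are the ones used in the command above, and the only assumption is that CONDA_PREFIX points at the environment where degenotate is installed.

```python
import os
import subprocess


def degenotate_command(conda_prefix, annot, genome, out_dir):
    """Build an argv list that runs degenotate.py with the env's own python."""
    bin_dir = os.path.join(conda_prefix, "bin")
    return [
        os.path.join(bin_dir, "python"),         # interpreter from the env
        os.path.join(bin_dir, "degenotate.py"),  # script from the same env
        "--overwrite",
        "-d", " ",
        "-a", annot,
        "-g", genome,
        "-o", out_dir,
    ]


def run_degenotate(annot, genome, out_dir):
    """Run degenotate from the active conda environment; raises on failure."""
    prefix = os.environ["CONDA_PREFIX"]  # set by `conda activate`
    subprocess.run(degenotate_command(prefix, annot, genome, out_dir), check=True)
```

Passing the interpreter path explicitly means the script's shebang line and whatever python comes first on PATH no longer decide which Python runs it.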

Here's what the full rule looks like in practice:

rule degenotate:
        input:
                genome="data/assemblies/{assembly}.fa",
                annot="data/annotations/{assembly}.gff3"
        output:
                "data/{assembly}_fourfoldDegenerateSites.bed"
        params:
                outputFolder="data/degenotateOutput/{assembly}"
        threads: 1
        conda:
                "../envs/degenotate.yml"
        shell:
                """
                # before running degenotate, check that correct python version is installed
                echo $CONDA_PREFIX

                # run degenotate, remove any extra information in fasta header after initial key
                # if previous run failed, overwrite that failed run
                $CONDA_PREFIX/bin/python $CONDA_PREFIX/bin/degenotate.py --overwrite -d " " -a {input.annot} -g {input.genome} -o {params.outputFolder}

                # subset out four-fold degenerate sites, degenotate has an option for this too
                awk '(($5 == 4))' {params.outputFolder}/degeneracy-all-sites.bed > {output}
                """

where degenotate.yml is simply:

name: degenotate

dependencies:
 - degenotate=1.0.0
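For reference, the awk step in the rule above keeps rows whose fifth column equals 4. A rough Python equivalent is sketched below; fourfold_lines is a hypothetical helper, and it assumes whitespace-delimited rows with a numeric fifth field, as the awk filter does.

```python
def fourfold_lines(lines, column=5, value=4):
    """Yield lines whose whitespace-delimited `column` (1-based) equals `value`.

    Roughly mirrors awk '$5 == 4'; short rows and non-numeric fields are skipped.
    """
    for line in lines:
        fields = line.split()
        if len(fields) < column:
            continue
        try:
            if float(fields[column - 1]) == value:
                yield line
        except ValueError:
            # Field is not a number, so it cannot equal `value`.
            continue


# Example usage (file names are placeholders):
# with open("degeneracy-all-sites.bed") as src, open("fourfold.bed", "w") as dst:
#     dst.writelines(fourfold_lines(src))
```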

Let me know if that makes sense. Bottom line: this seems to work as a way to call degenotate from snakemake!

Thanks very much for all of your help!

gwct commented 1 year ago

Awesome, glad there's a way to make it work! I think @tsackton has also been trying to work degenotate into a snakemake pipeline. I wonder if he's run into a similar issue and has any other solutions?

tsackton commented 1 year ago

I spent a little time working on this today. It looks like in order to get degenotate to install from conda with the correct python version via snakemake, you need to make sure that conda-forge and bioconda are both in your environment yaml file, e.g.:

channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
 - degenotate

At least for me this has fixed any issues with using degenotate with snakemake.

I am going to close this issue now but we can reopen it if there are new developments or bugs.