geronimp / enrichM

Toolbox for comparative genomics of MAGs

Issues installing/referencing the database #99

Open waoverholt opened 4 years ago

waoverholt commented 4 years ago

Hello,

I've been trying to install EnrichM and I keep running into the same problem (tried with 2 versions of python, 3.6 & 3.8).

When I run enrichm data --output /path/ I get the following error:

[2020-03-12 18:48:37 PM] INFO: Command: /opt/miniconda3/envs/enrichm/bin/enrichm data --output /data/databases/enrichM/
[2020-03-12 18:48:37 PM] INFO: Running the data pipeline
Traceback (most recent call last):
  File "/opt/miniconda3/envs/enrichm/lib/python3.8/site-packages/enrichm/data.py", line 114, in do
    version_remote = urllib.request.urlopen(self.ftp + self.VERSION).readline().strip().decode("utf-8")
AttributeError: module 'urllib' has no attribute 'request'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/miniconda3/envs/enrichm/bin/enrichm", line 342, in <module>
    run.run_enrichm(args, sys.argv)
  File "/opt/miniconda3/envs/enrichm/lib/python3.8/site-packages/enrichm/run.py", line 288, in run_enrichm
    d.do(args.uninstall, args.dry)
  File "/opt/miniconda3/envs/enrichm/lib/python3.8/site-packages/enrichm/data.py", line 116, in do
    raise Exception(
Exception: Unable to find most current EnrichM database VERSION in ftp. Please complain at https://github.com/geronimp/enrichM

I tried to get around that by manually installing and unpacking the latest database version (v10). When I point ENRICHM_DB to the unpacked tarball I get this error:

Traceback (most recent call last):
  File "/opt/miniconda3/envs/enrichm/bin/enrichm", line 38, in <module>
    from enrichm.run import Run
  File "/opt/miniconda3/envs/enrichm/lib/python3.8/site-packages/enrichm/run.py", line 24, in <module>
    from enrichm.network_analyzer import NetworkAnalyser
  File "/opt/miniconda3/envs/enrichm/lib/python3.8/site-packages/enrichm/network_analyzer.py", line 22, in <module>
    from enrichm.network_builder import NetworkBuilder
  File "/opt/miniconda3/envs/enrichm/lib/python3.8/site-packages/enrichm/network_builder.py", line 24, in <module>
    from enrichm.databases import Databases
  File "/opt/miniconda3/envs/enrichm/lib/python3.8/site-packages/enrichm/databases.py", line 28, in <module>
    class Databases:
  File "/opt/miniconda3/envs/enrichm/lib/python3.8/site-packages/enrichm/databases.py", line 36, in Databases
    PICKLE_VERSION = open(os.path.join(CUR_DATABASE_DIR, 'VERSION')).readline().strip()
FileNotFoundError: [Errno 2] No such file or directory: '/data/databases/enrichM/enrichm_database_v10/26-11-2018/VERSION'

The first issue seems to be a urllib error, and I saw somewhere online that changing the import statement from import urllib to import urllib.request as urllib might fix it, but I haven't tried this modification yet.
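For reference, a plain import urllib does not load the urllib.request submodule, which matches the AttributeError in the first traceback. The sketch below is only an illustration, not a patch to EnrichM's data.py: it imports the submodule explicitly, so a call shaped like the one in the traceback can stay unchanged. The helper name read_remote_version is hypothetical, and the demo reads a temporary local file through a file:// URL instead of contacting the real FTP server.

```python
import tempfile
import urllib.request  # explicit submodule import; "import urllib" alone is not enough
from pathlib import Path


def read_remote_version(url):
    """Return the first line of the resource at `url` as stripped text.

    Hypothetical helper that mirrors the shape of the failing call in the
    traceback; it is not EnrichM's actual code.
    """
    with urllib.request.urlopen(url) as handle:
        return handle.readline().strip().decode("utf-8")


# Demo against a temporary local file, so no network access is needed.
with tempfile.TemporaryDirectory() as tmp:
    version_file = Path(tmp) / "VERSION"
    version_file.write_text("26-11-2018\n")
    demo_version = read_remote_version(version_file.as_uri())

print(demo_version)
```

Compared with aliasing (import urllib.request as urllib), the explicit import leaves every existing urllib.request.urlopen(...) call as it is.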

The database error is clearly a path problem: the VERSION file inside enrichm_database_v10 contains 26-11-2018, and EnrichM then looks for a VERSION file in a subfolder with that name. I have probably skipped some formatting step that enrichm data performs.

To install enrichM I've tried:

conda create -n enrichm
conda activate enrichm
conda install -c bioconda mcl R hmmer diamond prodigal parallel openmp mmseqs2 moreutils seqmagick
conda install -c geronimp enrichm

I also forced it to try using python3.8:

conda create -n enrichm python=3.8
conda activate enrichm
conda install -c bioconda mcl R hmmer diamond prodigal parallel openmp mmseqs2 moreutils seqmagick
conda install -c geronimp enrichm
#dependency issue
pip install enrichm
#worked fine, but gave the exact same errors as above

About my environment:

    active environment : enrichm
    active env location : /opt/miniconda3/envs/enrichm
            shell level : 2
       user config file : /home/li49pol/.condarc
 populated config files : /home/li49pol/.condarc
          conda version : 4.8.2
    conda-build version : not installed
         python version : 2.7.11.final.0
       virtual packages : __glibc=2.27
       base environment : /opt/miniconda3  (read only)
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /opt/miniconda3/pkgs
                          /home/li49pol/.conda/pkgs
       envs directories : /home/li49pol/data/programs/conda
                          /home/li49pol/.conda/envs
                          /opt/miniconda3/envs
               platform : linux-64
             user-agent : conda/4.8.2 requests/2.22.0 CPython/2.7.11 Linux/4.15.0-64-generic ubuntu/18.04.3 glibc/2.27
                UID:GID : 1001:1001
             netrc file : None
           offline mode : False

python --version
#Python 3.8.1

Thanks for your time!

Best, Will

Abonacolta commented 4 years ago

I've been having the same issue. Let me know if you find a solution.

DennyPopp commented 4 years ago

I had the same issue after manually installing the database (v10). I could solve it by creating a folder within the enrichm_database_v10 folder called '26-11-2018' and copying the VERSION file into this new folder. To my surprise - it worked! enrichM is happily running.
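The workaround described above could be scripted roughly as follows. This is a sketch under assumptions, not an official EnrichM procedure: the function name apply_version_folder_workaround is hypothetical, and taking the folder name from the first line of VERSION is inferred from the path in the FileNotFoundError earlier in this thread. The demo runs in a temporary directory rather than on a real database.

```python
import shutil
import tempfile
from pathlib import Path


def apply_version_folder_workaround(db_dir):
    """Create <db_dir>/<version>/ and copy VERSION into it.

    Assumption: <version> is the first line of <db_dir>/VERSION
    (for example 26-11-2018), as the traceback path suggests.
    """
    db_dir = Path(db_dir).expanduser()
    version_file = db_dir / "VERSION"
    version = version_file.read_text().splitlines()[0].strip()
    target_dir = db_dir / version
    target_dir.mkdir(exist_ok=True)  # harmless if the folder already exists
    target = target_dir / "VERSION"
    shutil.copy2(version_file, target)
    return target


# Demo on a throwaway directory that imitates the unpacked v10 layout.
with tempfile.TemporaryDirectory() as tmp:
    demo_db = Path(tmp) / "enrichm_database_v10"
    demo_db.mkdir()
    (demo_db / "VERSION").write_text("26-11-2018\n")
    copied = apply_version_folder_workaround(demo_db)
    demo_relative = copied.relative_to(demo_db).as_posix()
    demo_content = copied.read_text().strip()

print(demo_relative)
```

On a real installation the call would take the unpacked database folder, i.e. the same path that ENRICHM_DB points to.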

liupfskygre commented 3 years ago

Hi DennyPopp, could you share how you manually installed the database and set it up? Thanks.

I installed enrichm using pip and have

  Version: 0.6.3

Here is what I have done so far. I downloaded v10 and uncompressed it here:

~/bio_db/enrichm_db/

Then I set the environment variable:

ENRICHM_DB=~/bio_db/enrichm_db/enrichm_database_v10

I created a folder called '26-11-2018' within the enrichm_database_v10 folder and copied the VERSION file into it, but I still get an error.

Any ideas?

Traceback (most recent call last):
  File "/usr/local/bin/enrichm", line 352, in <module>
    run.run_enrichm(args, sys.argv)
  File "/usr/local/lib/python3.8/dist-packages/enrichm/run.py", line 409, in run_enrichm
    pipeline(args)
  File "/usr/local/lib/python3.8/dist-packages/enrichm/run.py", line 282, in run_annotate
    annotate = Annotate(# Define inputs and outputs
  File "/usr/local/lib/python3.8/dist-packages/enrichm/annotate.py", line 113, in __init__
    self.databases = Databases()
  File "/usr/local/lib/python3.8/dist-packages/enrichm/databases.py", line 71, in __init__
    raise Exception(f"\nNo database version file found. Have you: \n\
Exception: 
No database version file found. Have you: 
- Installed the EnrichM database using the 'enrichm data' command?
- Specified the location of the EnrichM database by exporting a bash variable called ENRICHM_DB? (Currently I'm looking here: /home/dell/bio_db/enrichm_db/enrichm_database_v10)
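One way to narrow this down is to list where VERSION files actually sit under the directory named in the message. The sketch below is a generic diagnostic and makes no claim about EnrichM's own lookup logic: the function name find_version_files is hypothetical, and it only checks the database folder itself and its immediate subfolders. The demo uses a temporary directory.

```python
import tempfile
from pathlib import Path


def find_version_files(db_dir):
    """Return VERSION files found in db_dir and its immediate subfolders.

    Hypothetical diagnostic helper; it does not reproduce EnrichM's checks.
    """
    db_dir = Path(db_dir).expanduser()
    candidates = [db_dir / "VERSION"]
    if db_dir.is_dir():
        # Look one level down, e.g. in a date-named folder.
        candidates += sorted(p / "VERSION" for p in db_dir.iterdir() if p.is_dir())
    return [p for p in candidates if p.is_file()]


# Demo layout: VERSION at the top level and inside a date-named subfolder.
with tempfile.TemporaryDirectory() as tmp:
    demo_db = Path(tmp) / "enrichm_database_v10"
    (demo_db / "26-11-2018").mkdir(parents=True)
    (demo_db / "VERSION").write_text("26-11-2018\n")
    (demo_db / "26-11-2018" / "VERSION").write_text("26-11-2018\n")
    demo_found = [p.relative_to(demo_db).as_posix() for p in find_version_files(demo_db)]
    demo_missing = find_version_files(Path(tmp) / "does_not_exist")

print(demo_found)
```

In a real shell session the argument would be os.environ["ENRICHM_DB"]; an empty result would mean no VERSION file is visible at either level of that path.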