biobakery / MetaPhlAn

MetaPhlAn is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data
http://segatalab.cibio.unitn.it/tools/metaphlan/index.html
MIT License

KeyError: 'mpa_mpa_v296_CHOCOPhlAn_201901.tar' #78

Closed kescobo closed 4 years ago

kescobo commented 4 years ago

On a fresh conda install (`metaphlan 3.0 pyh5ca1d4c_1 bioconda`) with Python 3.6 and nothing else in the environment:

$ metaphlan --input_type fastq kneaddata/C0005_3F_1A_1000k_1_kneaddata.fastq -o ./

Downloading https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1
Downloading file of size: 0.00 MB
0.01 MB 232.33 %   4.76 MB/sec  0 min -0 sec
Downloading MetaPhlAn database
Please note due to the size this might take a few minutes

File /home/vklepacc/miniconda3/envs/metaphlan3/lib/python3.6/site-packages/metaphlan/metaphlan_databases/file_list.txt already present!
Traceback (most recent call last):
  File "/home/vklepacc/miniconda3/envs/metaphlan3/bin/metaphlan", line 10, in <module>
    sys.exit(main())
  File "/home/vklepacc/miniconda3/envs/metaphlan3/lib/python3.6/site-packages/metaphlan/metaphlan.py", line 1187, in main
    pars['index'] = check_and_install_database(pars['index'], pars['bowtie2db'], pars['bowtie2_build'], pars['nproc'], pars['force_download'])
  File "/home/vklepacc/miniconda3/envs/metaphlan3/lib/python3.6/site-packages/metaphlan/metaphlan.py", line 610, in check_and_install_database
    download_unpack_tar(FILE_LIST, index, bowtie2_db, bowtie2_build, nproc)
  File "/home/vklepacc/miniconda3/envs/metaphlan3/lib/python3.6/site-packages/metaphlan/metaphlan.py", line 463, in download_unpack_tar
    url_tar_file = ls_f["mpa_" + download_file_name + ".tar"]
KeyError: 'mpa_mpa_v296_CHOCOPhlAn_201901.tar'
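
The key in the error message carries the `mpa_` prefix twice, which suggests the index name already contained the prefix when the line shown in the traceback prepended it again. A minimal sketch of that failure mode and a prefix-aware lookup follows; the `file_list` dictionary, its URL, and the `tar_key` helper are hypothetical and are not taken from the MetaPhlAn source.

```python
# Hypothetical stand-in for the parsed file_list.txt: archive name -> download URL.
# The key and URL are illustrative assumptions, not copied from the real file list.
file_list = {
    "mpa_v296_CHOCOPhlAn_201901.tar": "https://example.org/mpa_v296_CHOCOPhlAn_201901.tar",
}


def tar_key(index_name: str) -> str:
    """Build the archive key, adding the 'mpa_' prefix only when it is missing."""
    prefix = "mpa_"
    if not index_name.startswith(prefix):
        index_name = prefix + index_name
    return index_name + ".tar"


index = "mpa_v296_CHOCOPhlAn_201901"

# What the traceback line does: prefix unconditionally, giving 'mpa_mpa_...'.
naive_key = "mpa_" + index + ".tar"
print(naive_key in file_list)       # False: indexing with this key raises KeyError

# The prefix-aware key matches the (assumed) entry in the file list.
print(tar_key(index) in file_list)  # True
```

The helper accepts index names with or without the prefix, so either naming convention resolves to the same archive key.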

fbeghini commented 4 years ago

That's strange: that bug was fixed in the second build. It also seems that conda installed the previous build; the latest build is

metaphlan 3.0 pyh5ca1d4c_2 bioconda
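
To confirm which build conda actually installed, the build string can be read from `conda list --json`. The sketch below assumes each entry in that JSON output has `name`, `version`, and `build_string` fields; the helper names `find_build` and `installed_build` are made up for illustration.

```python
import json
import subprocess


def find_build(conda_list_json: str, package: str = "metaphlan"):
    """Return (version, build_string) for `package`, or None if it is not listed."""
    for entry in json.loads(conda_list_json):
        if entry.get("name") == package:
            return entry.get("version"), entry.get("build_string")
    return None


def installed_build(package: str = "metaphlan"):
    """Query the active environment; requires conda on PATH and Python 3.7+."""
    out = subprocess.run(
        ["conda", "list", "--json"], check=True, capture_output=True, text=True
    ).stdout
    return find_build(out, package)


# Offline example: a hand-written record imitating one entry of the JSON output.
sample = (
    '[{"name": "metaphlan", "version": "3.0", '
    '"build_string": "pyh5ca1d4c_1", "channel": "bioconda"}]'
)
print(find_build(sample))  # ('3.0', 'pyh5ca1d4c_1')
```

A build string ending in `_1` would indicate the older build discussed in this thread, while `_2` is the one reported as fixed.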


Francesco Beghini PhD Student

Lab. of Computational Metagenomics Department of Cellular, Computational and Integrative Biology - CIBIO University of Trento Via Sommarive 9, 38123 Trento, Italy


kescobo commented 4 years ago

@fbeghini Thanks for the quick response - are there any requirements for that build besides Python 3.6? I just installed it last night 🤔

I can try adding that build manually...

kescobo commented 4 years ago

Tried `conda install -c bioconda metaphlan=3=pyh5ca1d4c_2` and got:


Solving environment: failed

UnsatisfiableError: The following specifications were found to be incompatible with each other:

Package pandas conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> pandas
Package samtools conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> samtools[version='>=1.9']
Package biom-format conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> biom-format
Package numpy conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> numpy
Package requests conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> requests
Package libstdcxx-ng conflicts for:
python=3.6 -> libstdcxx-ng[version='>=7.2.0|>=7.3.0']
Package cmseq conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> cmseq
Package phylophlan conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> phylophlan
Package tk conflicts for:
python=3.6 -> tk[version='8.6.*|>=8.6.7,<8.7.0a0|>=8.6.8,<8.7.0a0']
Package libffi conflicts for:
python=3.6 -> libffi[version='3.2.*|>=3.2.1,<4.0a0']
Package pysam conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> pysam
Package ld_impl_linux-64 conflicts for:
python=3.6 -> ld_impl_linux-64
Package bowtie2 conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> bowtie2[version='>=2.3.0']
Package sqlite conflicts for:
python=3.6 -> sqlite[version='>=3.20.1,<4.0a0|>=3.22.0,<4.0a0|>=3.23.1,<4.0a0|>=3.24.0,<4.0a0|>=3.25.2,<4.0a0|>=3.26.0,<4.0a0|>=3.29.0,<4.0a0|>=3.30.1,<4.0a0|>=3.31.1,<4.0a0']
Package scipy conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> scipy
Package readline conflicts for:
python=3.6 -> readline[version='7.*|>=7.0,<8.0a0|>=8.0,<9.0a0']
Package ncurses conflicts for:
python=3.6 -> ncurses[version='6.0.*|>=6.0,<7.0a0|>=6.1,<7.0a0|>=6.2,<7.0a0']
Package blast conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> blast[version='>=2.6.0']
Package xz conflicts for:
python=3.6 -> xz[version='>=5.2.3,<6.0a0|>=5.2.4,<6.0a0']
Package biopython conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> biopython
Package openssl conflicts for:
python=3.6 -> openssl[version='1.0.*|1.0.*,>=1.0.2l,<1.0.3a|>=1.0.2m,<1.0.3a|>=1.0.2n,<1.0.3a|>=1.0.2o,<1.0.3a|>=1.0.2p,<1.0.3a|>=1.1.1a,<1.1.2a|>=1.1.1c,<1.1.2a|>=1.1.1d,<1.1.2a|>=1.1.1e,<1.1.2a']
Package libgcc-ng conflicts for:
python=3.6 -> libgcc-ng[version='>=7.2.0|>=7.3.0']
Package zlib conflicts for:
python=3.6 -> zlib[version='>=1.2.11,<1.3.0a0']
Package dendropy conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> dendropy
Package raxml conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> raxml[version='>=8.2.10']
Package pip conflicts for:
python=3.6 -> pip
Package muscle conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> muscle[version='>=3.8.1551']
Package matplotlib-base conflicts for:
metaphlan==3=pyh5ca1d4c_2 -> matplotlib-base

fbeghini commented 4 years ago

No, only Python is needed. I've tried creating the env with only Python (no version specified) and then installing the package. Have you configured all the bioconda channels and updated conda?



kescobo commented 4 years ago

I did not have the most recent version of conda, but I just cleared the base environment, updated conda, made a brand new environment, and tried installing pyh5ca1d4c_2 again:

$ conda install -c bioconda metaphlan=3=pyh5ca1d4c_2
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: -
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
failed

UnsatisfiableError: The following specifications were found to be incompatible with each other:

Output in format: Requested package -> Available versions

kescobo commented 4 years ago

So, I ended up installing from source. In my conda environment, I had to run `conda install -c bioconda biom-format`, then run `pip install .` from within the downloaded source code.

fbeghini commented 4 years ago

Hi @kescobo, it seems that other people have the same issue (I found this: https://github.com/conda/conda/issues/9367), but different people solved it in different ways.

I tried again with a clean conda installation. I can reproduce the same error if I create a new env with `conda create -n mpa python`, which pulls in the latest Python version (3.8.2), and then install MetaPhlAn with `conda install metaphlan`.

If I create an empty environment and then install MetaPhlAn, the correct Python version (3.7.6) is fetched and the installation goes smoothly. The right build is also fetched (bioconda/noarch::metaphlan-3.0-pyh5ca1d4c_2).

I had to `conda install -c bioconda biom-format`, then from within the downloaded source code `pip install .`

Was biom-format not installed by the MetaPhlAn setup?
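
The empty-environment approach described in this comment can be sketched as two conda invocations: create an environment without requesting Python, then install MetaPhlAn so the solver picks a compatible interpreter. The environment name `mpa` comes from the thread; the explicit `conda-forge` and `bioconda` channel flags and the `metaphlan=3.0` pin are assumptions, and the helper names are hypothetical.

```python
import subprocess


def metaphlan_install_commands(env_name: str = "mpa"):
    """Commands for: create an empty environment, then install MetaPhlAn into it."""
    return [
        # No python (or any other package) is requested here, so the solver is
        # free to choose a compatible Python when MetaPhlAn is installed.
        ["conda", "create", "--yes", "--name", env_name],
        # Channel list and version pin are assumptions for this sketch.
        ["conda", "install", "--yes", "--name", env_name,
         "--channel", "conda-forge", "--channel", "bioconda", "metaphlan=3.0"],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Print the commands for review; call run_all(...) to actually execute them.
for cmd in metaphlan_install_commands():
    print(" ".join(cmd))
```

Printing the commands first makes it easy to check them against your own channel configuration before anything is installed.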

kescobo commented 4 years ago

Was biom-format not installed by the MetaPhlAn setup?

No, on the first attempt it couldn't build it for some reason, I didn't do much investigation as to why.

kescobo commented 4 years ago

Hey, it looks like I still have some of the terminal history - not all of it, but this gist has what I could get. Relevant part may be at the end:

Failed to build biom-format
ERROR: Could not build wheels for biom-format which use PEP 517 and cannot be installed directly

fbeghini commented 4 years ago

Yes, it seems that gcc is missing, since it is required to build biom-format (requires cython, so gcc...)
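
Before running `pip install .`, a quick check for a C compiler on `PATH` can show whether the biom-format build is likely to fail for this reason. This is a rough sketch using only the standard library; the candidate names are an assumption and the helper `find_c_compiler` is hypothetical.

```python
import shutil


def find_c_compiler(candidates=("gcc", "cc", "clang")):
    """Return the path of the first C compiler found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


compiler = find_c_compiler()
if compiler is None:
    print("No C compiler found on PATH; building biom-format from source will likely fail.")
else:
    print(f"C compiler available: {compiler}")
```

Finding a compiler does not guarantee the build succeeds, but a missing one matches the explanation given in this comment.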

kescobo commented 4 years ago

Ahh, that makes sense. Well, let's call this closed for now - I think someone else running into this error will have plenty of things to try.

nahanoo commented 4 years ago

I'm running into the same issue, so I'll post some updates from my side, because I don't think the cause of the error has been identified yet.

Installing into a new conda environment fetched python 3.7.6 and bioconda/noarch::metaphlan-3.0-pyh5ca1d4c_2. This still resulted in the KeyError. GCC and biom-format were installed as well.

Only installing from source seemed to do the trick. Btw, the master branch is not the 3.0 version and also doesn't come with a setup.py file, so maybe mention in the instructions that people need to check out the 3.0 branch.

Let me know if I can help.

Cheers

fbeghini commented 4 years ago

I'll bump a new conda build with the updates included in 3.0

Thank you for the advice, I'll mention in the manual that people need to check out the 3.0 branch.

nahanoo commented 4 years ago

Awesome, thank you!

simmalysimle commented 3 years ago

Hi, I also have the same issue. [image] Looking forward to your answers, thank you very much!