biobakery / MetaPhlAn

MetaPhlAn is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data
http://segatalab.cibio.unitn.it/tools/metaphlan/index.html
MIT License

Unable to download https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1 #86

Closed srosales712 closed 4 years ago

srosales712 commented 4 years ago

Hi, our government servers do not have access to Dropbox. Is there a way to get all the files needed to run MetaPhlAn without using Dropbox? I can download the file (file_list.txt) to my local computer, but it looks like MetaPhlAn still fetches the remaining files through Dropbox.

Thanks, Stephanie

fbeghini commented 4 years ago

Hi Stephanie, do you have access to Google Drive? I can upload the database there

fbeghini commented 4 years ago

Hi Stephanie, I've mirrored the databases on Google Drive. You can manually download the tar and md5 files for the database version you want here, extract the tar, and build the bowtie2 indices by running, e.g., bowtie2build mpa_v20_m200.fna mpa_v20_m200 if you want to build the MetaPhlAn2 database.
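The manual steps above (check the md5, extract the tar, build the index) can be sketched in Python as follows. This is only an illustration under stated assumptions: the .md5 file is assumed to hold the digest as its first whitespace-separated token, the archive is assumed to unpack to a FASTA file as described above, and the function names are hypothetical. Note that the Bowtie 2 indexer executable is named bowtie2-build, with a hyphen.

```python
import hashlib
import subprocess
import tarfile
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_md5(md5_path):
    """Read the digest from an .md5 file (assumption: first token is the digest)."""
    return Path(md5_path).read_text().split()[0].lower()


def extract_verified(tar_path, md5_path, dest_dir):
    """Extract tar_path into dest_dir only if its checksum matches."""
    if md5_of(tar_path) != expected_md5(md5_path):
        raise ValueError(f"checksum mismatch for {tar_path}")
    with tarfile.open(tar_path) as archive:
        archive.extractall(dest_dir)
    # Return the extracted names so the caller can locate the FASTA file.
    return sorted(p.name for p in Path(dest_dir).iterdir())


def build_index(fasta_path, index_prefix):
    """Run bowtie2-build on the extracted FASTA (needs Bowtie 2 on the PATH)."""
    subprocess.run(["bowtie2-build", str(fasta_path), str(index_prefix)], check=True)
```

A call might look like extract_verified("mpa_v20_m200.tar", "mpa_v20_m200.md5", "db") followed by build_index("db/mpa_v20_m200.fna", "db/mpa_v20_m200"); the .md5 file name here is a guess, so adjust it to whatever the mirror actually provides.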

dangchenyuan commented 4 years ago

@fbeghini Hi, I was wondering what the database names (mpa_v20, mpa_v292, mpa_v30) mean. Does v30 mean the latest version? Is it OK to use only v30?

fbeghini commented 4 years ago

The prefix is the database version; v30 is the latest one and it is the one to use.


Francesco Beghini

PhD Student

Lab. of Computational Metagenomics

Department of Cellular, Computational and Integrative Biology - CIBIO

University of Trento Via Sommarive 9, 38123 Trento, Italy


sclirl commented 4 years ago

Hi, I got a similar error:

~/.../miniconda3/bin/metaphlan test.fq --input_type fastq -o profiled_metagenome.txt

Downloading https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1

Warning: Unable to download https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1
Traceback (most recent call last):
  File "~/.../miniconda3/bin/metaphlan", line 10, in <module>
    sys.exit(main())
  File "~/.../miniconda3/lib/python3.7/site-packages/metaphlan/metaphlan.py", line 1187, in main
    pars['index'] = check_and_install_database(pars['index'], pars['bowtie2db'], pars['bowtie2_build'], pars['nproc'], pars['force_download'])
  File "~/.../miniconda3/lib/python3.7/site-packages/metaphlan/metaphlan.py", line 589, in check_and_install_database
    index = resolve_latest_database(bowtie2_db, ls_f['mpa_latest'], force_redownload_latest)
UnboundLocalError: local variable 'ls_f' referenced before assignment

How can I solve this error or download the file named in the warning? Any suggestions will be greatly appreciated.

Thanks, Ruilin
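
For readers hitting the same traceback: an UnboundLocalError of this kind usually means a variable was assigned only inside a block that failed, and was then used anyway. The sketch below reproduces that pattern with hypothetical functions (fetch_file_list, fetch_file_list_safe, blocked_download). It is not the MetaPhlAn source, only an illustration consistent with the traceback above.

```python
def fetch_file_list(download):
    """Hypothetical stand-in showing the failure pattern, not MetaPhlAn code."""
    try:
        ls_f = download()  # assigned only if the download succeeds
    except OSError as err:
        print(f"Warning: Unable to download ({err})")
    # If the download failed, ls_f was never bound, so this line raises
    # UnboundLocalError instead of a helpful message.
    return ls_f["mpa_latest"]


def fetch_file_list_safe(download):
    """Same logic, but a failed download stops with a clear error."""
    try:
        ls_f = download()
    except OSError as err:
        raise RuntimeError("file list unavailable; use a local database") from err
    return ls_f["mpa_latest"]


def blocked_download():
    """Simulates a server that cannot reach the download host."""
    raise OSError("network unreachable")
```

Calling fetch_file_list(blocked_download) prints the warning and then raises UnboundLocalError, while fetch_file_list_safe(blocked_download) raises a RuntimeError that names the real problem.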

fbeghini commented 4 years ago

Hi Ruilin, you can download just the mpa_v30 files from the link I posted in https://github.com/biobakery/MetaPhlAn/issues/86#issuecomment-623339218 and follow the instructions there.

sclirl commented 4 years ago

Hi, thanks for your suggestion. There is still a problem; can you help me check which step has an error?

  1. Download the tar from your Google Drive: mpa_v30_CHOCOPhlAn_201901.tar

  2. Extract mpa_v30_CHOCOPhlAn_201901.tar using the following command: tar -xf mpa_v30_CHOCOPhlAn_201901.tar

  3. Build the bowtie2 indices using the following command: /home/.../miniconda3/bin/bowtie2-build mpa_v30_CHOCOPhlAn_201901.fna mpa_v30_CHOCOPhlAn_201901

     This produced 6 outputs: mpa_v30_CHOCOPhlAn_201901.1.bt2 mpa_v30_CHOCOPhlAn_201901.2.bt2 mpa_v30_CHOCOPhlAn_201901.3.bt2 mpa_v30_CHOCOPhlAn_201901.4.bt2 mpa_v30_CHOCOPhlAn_201901.rev.1.bt2 mpa_v30_CHOCOPhlAn_201901.rev.2.bt2

  4. Move mpa_v30_CHOCOPhlAn_201901.tar, mpa_v30_CHOCOPhlAn_201901.fna, and the 6 outputs from step 3 to the directory: /home/.../miniconda3/lib/python3.7/site-packages/metaphlan/metaphlan_databases/

  5. Test, which throws a new error:

/home/.../miniconda3/bin/metaphlan /home/.../metaphlan_turorial/SRS014476-Supragingival_plaque.fasta.gz --input_type fasta > SRS014476-Supragingival_plaque_profile.txt

OSError: fatal error running '/home/liruilin/miniconda3/lib/python3.7/site-packages/metaphlan/utils/read_fastx.py'. Is it in the system path?

Thanks, Ruilin

fbeghini commented 4 years ago

Since you installed MetaPhlAn using Anaconda, you should activate your base environment first so that all the scripts can be found, and then run just metaphlan.
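
A quick way to confirm that the environment is set up is to check which commands are visible on the PATH after activating it. The sketch below uses only the Python standard library; the list of command names is an assumption drawn from the commands mentioned in this thread, and missing_from_path is a hypothetical helper.

```python
import shutil


def missing_from_path(commands):
    """Return the commands that cannot be found on the current PATH."""
    return [name for name in commands if shutil.which(name) is None]


# Assumed list, based on the commands used in this thread; extend as needed.
REQUIRED = ["metaphlan", "bowtie2", "bowtie2-build"]

if __name__ == "__main__":
    missing = missing_from_path(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```

If any name is reported as missing, the environment that contains MetaPhlAn is probably not active in the current shell.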

maxibor commented 4 years ago

Hi @fbeghini, similar error here, because the Dropbox download doesn't seem to work every time. Which combination of flags do you use to tell MetaPhlAn (v2.8, from Bioconda) to use the local copy of the database?

I tried

--mpa_pkl mpa_v20_m200.pkl --bowtie2db mpa_v20_m200/mpa_v20_m200

But these flags always trigger a download of the Dropbox copy of the MetaPhlAn DB (the same happens with -x mpa_v20_m200/mpa_v20_m200).

$ ls mpa_v20_m200/
mpa_v20_m200.1.bt2
mpa_v20_m200.2.bt2
mpa_v20_m200.3.bt2
mpa_v20_m200.4.bt2
mpa_v20_m200.rev.1.bt2
mpa_v20_m200.rev.2.bt2

fbeghini commented 4 years ago

Hi @maxibor, that combination no longer works because --mpa_pkl is deprecated. You can use --bowtie2db mpa_v20_m200 to point MetaPhlAn to the database folder; mpa_pkl is set automatically from the database name (v20_m200).
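
Before pointing --bowtie2db at a folder, it can help to confirm that the folder is complete. The sketch below is a hypothetical helper, not part of MetaPhlAn: the expected suffixes (six .bt2 index files plus a .pkl) are taken from the file listings posted elsewhere in this thread, and other index layouts may exist.

```python
from pathlib import Path

# Suffixes seen in the listings in this thread: six bowtie2 index files
# plus the pickle that shares the database name.
EXPECTED_SUFFIXES = [
    ".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2",
    ".rev.1.bt2", ".rev.2.bt2", ".pkl",
]


def missing_db_files(db_folder, index_name):
    """Return the expected database files that are absent from db_folder."""
    folder = Path(db_folder).expanduser()
    return [
        index_name + suffix
        for suffix in EXPECTED_SUFFIXES
        if not (folder / (index_name + suffix)).is_file()
    ]
```

For example, missing_db_files("mpa_v20_m200", "mpa_v20_m200") returns an empty list when every expected file is present, and otherwise names the files that still need to be copied or built.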

MengZhang2019 commented 4 years ago

Hi, I got a similar error: "Unable to download https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1"

Then I downloaded "mpa_v30_CHOCOPhlAn_201901.tar" from Google Drive and built the index with the following command:

$ bowtie2-build mpa_v30_CHOCOPhlAn_201901.fna mpa_v30_CHOCOPhlAn_201901

Then I ran MetaPhlAn again, but it still tries to download the database, and the same problem remains:

$ metaphlan A76_1.clean.fq,A76_2.clean.fq --input_type fastq --bowtie2out A76.bowtie2.bz2 --index /home/zhangm/miniconda3/envs/mpa/1_db/mpa_v30_CHOCOPhlAn_201901 -o A76_mpa2.txt

Downloading https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1

Warning: Unable to download https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1

When I change --index to --bowtie2db, the same problem occurs.

How can I solve this error? Any suggestions will be greatly appreciated.

Thanks, Meng

wfgui commented 4 years ago

Hi @fbeghini, I downloaded "mpa_v30_CHOCOPhlAn_201901.tar" from Google Drive. Then I used the following command to build the index:

$ bowtie2-build ~/Database/humann3/metaphlan3/mpa_v30_CHOCOPhlAn_201901.fna ~/Database/humann3/metaphlan3/mpa_v30_CHOCOPhlAn_201901

Then I ran MetaPhlAn again, but it still tries to download the database, and the same problem remains:

metaphlan demo.fastq -o profile.txt --nproc 5 --bowtie2db ~/Database/humann3/metaphlan3/mpa_v30_CHOCOPhlAn_201901 --input_type fastq

Downloading https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1

fbeghini commented 4 years ago

Hi @MengZhang2019 @hjdong , I've pushed a fix for MetaPhlAn, please pull and re-install the software from source code. Now, if the database name is provided using --index, MetaPhlAn will not try to download any files from the Internet.

metaphlan demo.fastq -o profile.txt --nproc 5 --bowtie2db <folder> --index mpa_v30_CHOCOPhlAn_201901

Let me know if any problems arise

MengZhang2019 commented 4 years ago

Hi @fbeghini, thank you for helping us. I manually downloaded MetaPhlAn-3.0.zip from GitHub, installed the prerequisite software, and ran:

python setup.py install

It reported "Finished processing dependencies for MetaPhlAn==3.0". Then I ran the following command:

metaphlan A76_1.clean.fq,A76_2.clean.fq --input_type fastq --bowtie2out A76.bowtie2.bz2 --bowtie2db /home/zhangm/metaphlan_databases --index /home/zhangm/metaphlan_databases/mpa_v30_CHOCOPhlAn_201901 -o A76.txt

My folder includes:

mpa_v30_CHOCOPhlAn_201901.1.bt2
mpa_v30_CHOCOPhlAn_201901.2.bt2
mpa_v30_CHOCOPhlAn_201901.3.bt2
mpa_v30_CHOCOPhlAn_201901.4.bt2
mpa_v30_CHOCOPhlAn_201901.rev.1.bt2
mpa_v30_CHOCOPhlAn_201901.rev.2.bt2
mpa_v30_CHOCOPhlAn_201901.pkl

But it still shows the same error:

Downloading https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1

Warning: Unable to download https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1

Did I miss something or do something wrong?

Thanks, Meng

fbeghini commented 4 years ago

--index should have only the name of the database and not the full path
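
The split between the two flags can be sketched as follows: the folder goes to --bowtie2db and only the bare database name goes to --index. The helper names below are hypothetical, and the sketch assumes the input is a full index prefix (folder plus database name) on a POSIX-style path.

```python
import os


def split_index_path(index_path):
    """Split a full index prefix into (folder, database name)."""
    folder, name = os.path.split(os.path.normpath(index_path))
    return folder, name


def metaphlan_db_args(index_path):
    """Build the two database flags from a full index prefix."""
    folder, name = split_index_path(index_path)
    return ["--bowtie2db", folder, "--index", name]
```

For the path used above, metaphlan_db_args("/home/zhangm/metaphlan_databases/mpa_v30_CHOCOPhlAn_201901") yields --bowtie2db /home/zhangm/metaphlan_databases and --index mpa_v30_CHOCOPhlAn_201901.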



wfgui commented 4 years ago

Hi @fbeghini, thank you for helping us, but it still shows the same error. My command:

metaphlan demo.fastq -o profile.txt --nproc 5 --bowtie2db ~/Database/humann3/metaphlan3/ --index mpa_v30_CHOCOPhlAn_201901 --input_type fastq

My folder ~/Database/humann3/metaphlan3/ includes:

mpa_v30_CHOCOPhlAn_201901.1.bt2 
mpa_v30_CHOCOPhlAn_201901.2.bt2
mpa_v30_CHOCOPhlAn_201901.3.bt2
mpa_v30_CHOCOPhlAn_201901.4.bt2
mpa_v30_CHOCOPhlAn_201901.rev.1.bt2
mpa_v30_CHOCOPhlAn_201901.rev.2.bt2
mpa_v30_CHOCOPhlAn_201901.pkl 

Thanks.

MengZhang2019 commented 4 years ago

@fasnicar I don't know what the problem is. I ran mpa3 with this command:

metaphlan A76_1.clean.fq,A76_2.clean.fq --input_type fastq --bowtie2out A76.bowtie2.bz2 --bowtie2db /home/zhangm/metaphlan_databases --index mpa_v30_CHOCOPhlAn_201901 -o A76.txt

It still shows the download warning, but I get a result. The messages are:

Downloading https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1
Warning: Unable to download https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1
Use of uninitialized value $bt2_args[2] in join or string at /usr/bin/bowtie2 line 423.
Use of uninitialized value $bt2_args[3] in join or string at /usr/bin/bowtie2 line 423.
Use of uninitialized value $_[2] in string eq at /usr/bin/bowtie2 line 360.
Use of uninitialized value $_[3] in string eq at /usr/bin/bowtie2 line 360.
Use of uninitialized value in exists at /usr/bin/bowtie2 line 81.
Use of uninitialized value in exists at /usr/bin/bowtie2 line 81.
Use of uninitialized value $bt2_args[2] in join or string at /usr/bin/bowtie2 line 459.
Use of uninitialized value $bt2_args[3] in join or string at /usr/bin/bowtie2 line 459.
WARNING: The metagenome profile contains clades that represent multiple species merged into a single representant. An additional column listing the merged species is added to the MetaPhlAn output.

The result folder contains A76_1.clean.fq, A76_2.clean.fq, A76.bowtie2.bz2, and A76.txt. Is this normal?

Thanks, Meng

fbeghini commented 4 years ago

@MengZhang2019 That's strange, you should not get the "Downloading ..." message. Do you have all these files in /home/zhangm/metaphlan_databases?

mpa_v30_CHOCOPhlAn_201901.1.bt2
mpa_v30_CHOCOPhlAn_201901.2.bt2
mpa_v30_CHOCOPhlAn_201901.3.bt2
mpa_v30_CHOCOPhlAn_201901.4.bt2
mpa_v30_CHOCOPhlAn_201901.rev.1.bt2
mpa_v30_CHOCOPhlAn_201901.rev.2.bt2
mpa_v30_CHOCOPhlAn_201901.pkl

MengZhang2019 commented 4 years ago

@fbeghini Yes, all of these. Although it still shows this error, I can get the mpa3 result:

#mpa_v30_CHOCOPhlAn_201901

SampleID | Metaphlan_Analysis

clade_name | NCBI_tax_id | relative_abundance
s__Dialister_sp_CAG_357 | 1262869 | 55.75441
s__Faecalibacterium_prausnitzii | 853 | 7.29694
s__Eubacterium_rectale | 39491 | 5.59381
s__Escherichia_coli | 562 | 5.21096
s__Bacteroides_vulgatus | 821 | 4.08146
... ...

You can look into this issue, or ignore it if it does not affect the result.

Thanks, Meng

fangling0913 commented 4 years ago

Hi @fbeghini, I downloaded the mpa_v30 database and built the index, but I still get the download message:

Downloading https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAA4XDP85WHon_eHvztxkamTa/file_list.txt?dl=1

The files in /data/software/ONCO/anaconda3/lib/python3.7/site-packages/metaphlan/metaphlan_databases/ are:

mpa_v30_CHOCOPhlAn_201901.1.bt2
mpa_v30_CHOCOPhlAn_201901.2.bt2
mpa_v30_CHOCOPhlAn_201901.3.bt2
mpa_v30_CHOCOPhlAn_201901.4.bt2
mpa_v30_CHOCOPhlAn_201901.rev.1.bt2
mpa_v30_CHOCOPhlAn_201901.rev.2.bt2
mpa_v30_CHOCOPhlAn_201901.pkl

My command:

metaphlan R1.fastq,R2.fastq --nproc 8 --bowtie2db /data/software/ONCO/anaconda3/lib/python3.7/site-packages/metaphlan/metaphlan_databases/ --index mpa_v30_CHOCOPhlAn_201901 --input_type fastq -o 06800863_1_result.txt

fbeghini commented 4 years ago

Could you try installing it from source? I've pushed a quick fix for this


fangling0913 commented 4 years ago

@fbeghini the fix still doesn't work.

fbeghini commented 4 years ago

Could you please post here the content of lines 247-260 (check_and_install_database) from /data/software/ONCO/anaconda3/lib/python3.7/site-packages/metaphlan/__init__.py? I've run exactly your command line and the download attempt is not executed on my side.

fangling0913 commented 4 years ago

@fbeghini Sorry, I checked the file. I did install from source, but the file had not changed. I replaced lines 247-260 with the version on GitHub, and it works now. Thanks.

wfgui commented 4 years ago

Question: Unable to download https://bitbucket.org/biobakery/metaphlan2/downloads/mpa_v20_m200.tar. Can anyone give me a link to the database that can be downloaded from within China?

Here: https://drive.google.com/drive/folders/1_HaY16mT7mZ_Z8JtesH8zCfG9ikWcLXG

BOASE-Kate commented 1 year ago

Hi Stephanie, I've mirrored the databases on Google Drive, you can manually download the tar and md5 files for the database version you want here , extract the tar, and build the bowtie2 indices by running e.g. bowtie2build mpa_v20_m200.fna mpa_v20_m200. if you want to build the MetaPhlAn2 database

Just wanted to add that I had a similar issue: running "bowtie2build mpa_v20_m200.fna mpa_v20_m200" wouldn't work, but running "bowtie2-build mpa_v20_m200.fna mpa_v20_m200" did.