biobakery / phylophlan

Precise phylogenetic analysis of microbial isolates and genomes from metagenomes
https://huttenhower.sph.harvard.edu/phylophlan
MIT License

/phylophlan_configs/" folder does not exists and unable to download phylophlan_databases.txt?dl=1 #33

Closed: zyyalice closed this issue 3 years ago

zyyalice commented 4 years ago

Hi

I would like to use phylophlan, but I got an error message when running the following command:

phylophlan -i AR_and_Bac/ -d phylophlan --diversity low -f supermatrix_nt.cfg --nproc 42

[e] "/home/emma/anaconda2/envs/phylophlan/lib/python3.7/site-packages/phylophlan/phylophlan_configs/" folder does not exists
[e] unable to download "https://www.dropbox.com/s/x7cvma5bjzlllbt/phylophlan_databases.txt?dl=1"

I installed phylophlan using the following commands:

1: conda create -n "phylophlan" -c bioconda phylophlan=3.0
   conda activate phylophlan
2: phylophlan_write_default_configs.sh [output_folder]

Then I tested the installation:

phylophlan --version
PhyloPhlAn version 3.0.51 (11 May 2020)

But I got the error above when I tried the basic usage.

I don't know the reason. Could you help me fix this?

Thank you very much! Looking forward to your reply!

fasnicar commented 4 years ago

Hi and thanks for reporting this.

I believe something is not working in your installation, because I just installed phylophlan from scratch in a new env, ran the very same commands you posted, and it worked fine. The only difference I can see right now is that you're using anaconda2 while I'm using anaconda3 on my system; I'm not sure whether this could be the problem.

Thanks, Francesco

wqssf102 commented 4 years ago

Q1: Create a folder in advance: mkdir phylophlan_configs, then run: phylophlan_write_default_configs.sh phylophlan_configs/, and then copy the 'phylophlan_configs' folder to '/home/emma/anaconda2/envs/phylophlan/lib/python3.7/site-packages/phylophlan'.

Q2: Download the file at "https://www.dropbox.com/s/x7cvma5bjzlllbt/phylophlan_databases.txt?dl=1", then download the 4 files listed in it and extract them. The extracted folders contain the databases; put the databases in the location you will specify later.

Sorry, my English is poor.
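The Q1 workaround (create the folder, write the default configs into it, copy it into the installed package directory) can be sketched as a small shell function. This is only a sketch under assumptions: the function name install_phylophlan_configs is made up for illustration, and the package directory must be the site-packages/phylophlan path shown in your own error message.

```shell
# Hypothetical helper (not part of PhyloPhlAn): generate the default configs
# and copy them into the installed package directory.
#   $1 = package directory, e.g. the .../site-packages/phylophlan path from
#        the "[e] ... folder does not exists" message
#   $2 = working directory in which to create phylophlan_configs (default: .)
install_phylophlan_configs() {
    pkg_dir="$1"
    work_dir="${2:-.}"

    if [ ! -d "$pkg_dir" ]; then
        echo "package directory not found: $pkg_dir" >&2
        return 1
    fi

    mkdir -p "$work_dir/phylophlan_configs" || return 1

    # Write the default .cfg files if the helper script is on PATH,
    # i.e. if the conda environment is active.
    if command -v phylophlan_write_default_configs.sh >/dev/null 2>&1; then
        phylophlan_write_default_configs.sh "$work_dir/phylophlan_configs/"
    else
        echo "phylophlan_write_default_configs.sh not found; activate the env first" >&2
    fi

    # Copy the folder to where PhyloPhlAn looks for it.
    cp -r "$work_dir/phylophlan_configs" "$pkg_dir/"
}
```

For the installation in this thread, the call would be install_phylophlan_configs /home/emma/anaconda2/envs/phylophlan/lib/python3.7/site-packages/phylophlan.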

zyyalice commented 4 years ago

Hello, thank you very much for your answer; I am very grateful.

I have one more question. Your answer to Q1 solved the first error I ran into. However, I still have doubts about the -d database parameter and about how to use the databases once they are extracted to a chosen location.

My understanding is that -d should be given the path where the downloaded database is stored, as in: phylophlan -i test_genome -d /HD1/database/phylophlan --diversity low -f supermatrix_nt.cfg --nproc 42. This fails, so I still do not understand how to use the -d parameter. I would appreciate your guidance.

Thank you very much!

wqssf102 commented 4 years ago

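Hello. Suppose your command is: phylophlan -i bins/ -d phylophlan --diversity medium -f supermatrix_aa.cfg -t a -o res/

In that case, create a directory phylophlan_databases\phylophlan or phylophlan_databases/phylophlan in your current working directory and put the extracted database into it.

If you instead use: phylophlan -i bins/ -d amphora2 --diversity medium -f supermatrix_aa.cfg -t a -o res/

then create an "amphora2" folder under "phylophlan_databases" and put the extracted amphora2 files into it.

I ran into other problems in my later analyses. This is my first time working with metagenomics and I would like to ask you for advice. My email is 565715597@qq.com; I hope we can exchange contact details by email. Thank you.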
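The layout described here (a phylophlan_databases folder in the working directory, with one subfolder per database name passed to -d) can be sketched as follows. The function name place_database is hypothetical, and the sketch assumes the downloaded archive unpacks its files directly rather than into a top-level folder of its own; if yours does the latter, move the contents up one level afterwards.

```shell
# Hypothetical helper: unpack a downloaded database archive into
# <work_dir>/phylophlan_databases/<name>, the folder layout described above.
#   $1 = database name as passed to -d (e.g. phylophlan or amphora2)
#   $2 = path to the downloaded .tar archive
#   $3 = directory from which phylophlan will be run (default: .)
place_database() {
    name="$1"
    archive="$2"
    work_dir="${3:-.}"
    dest="$work_dir/phylophlan_databases/$name"

    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi

    mkdir -p "$dest" || return 1
    # Assumption: the archive holds the database files at its top level.
    tar -xf "$archive" -C "$dest"
}
```

With this layout the command keeps using the database name, for example -d phylophlan, rather than a filesystem path.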

Hocnonsense commented 4 years ago

I cannot download it either.

I created ~/anaconda/envs/phylophlan/lib/python3.9/site-packages/phylophlan/phylophlan_configs/phylophlan, but it still reported '[e] unable to download "https://www.dropbox.com/s/x7cvma5bjzlllbt/phylophlan_databases.txt?dl=1"'

Then I tried to download it with wget https://www.dropbox.com/s/x7cvma5bjzlllbt/phylophlan_databases.txt?dl=1, which failed with "443... failed: Network is unreachable"

Using a proxy did not help either.

Edit: Dropbox cannot be reached from mainland China without a VPN. Below is the content of the file (phylophlan_databases.txt) as of 2020-11-07 21:07:50:

  #database_name  database_url    database_md5
  amphora2    https://zenodo.org/record/4005745/files/amphora2.tar?download=1 https://zenodo.org/record/4005745/files/amphora2.md5?download=1
  phylophlan  https://zenodo.org/record/4005620/files/phylophlan.tar?download=1   https://zenodo.org/record/4005620/files/phylophlan.md5?download=1
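For anyone who cannot reach Dropbox, a copy of the listing above saved as a local file can be used to fetch the archives directly from Zenodo. The following is a minimal sketch, not PhyloPhlAn's own download logic: the function names are made up, it assumes whitespace-separated columns as shown, and it does not verify the md5 files.

```shell
# Hypothetical helpers for a local copy of phylophlan_databases.txt.

# Print "name url md5_url" for every non-comment row of the listing.
list_databases() {
    awk '$1 !~ /^#/ && NF >= 3 { print $1, $2, $3 }' "$1"
}

# Print the archive URL for one database name; fails if the name is unknown.
database_url() {
    url=$(list_databases "$1" | awk -v name="$2" '$1 == name { print $2 }')
    [ -n "$url" ] && printf '%s\n' "$url"
}

# Download the archive for one database name (needs network access).
fetch_database() {
    url=$(database_url "$1" "$2") || { echo "unknown database: $2" >&2; return 1; }
    wget -O "$2.tar" "$url"
}
```

For example, fetch_database phylophlan_databases.txt phylophlan would save phylophlan.tar in the current directory.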
fasnicar commented 4 years ago

Hi, I just checked and it seems to work fine on my side. Can you please (1) report which PhyloPhlAn version you're using and (2) try again to see whether it was a temporary network issue?

Many thanks, Francesco

nick-youngblut commented 3 years ago

How does one properly set up */envs/phylophlan/lib/python3.7/site-packages/phylophlan/phylophlan_configs without actually running phylophlan? I would like to run StrainPhlAn in parallel on a cluster that lacks an internet connection, so I need to download https://www.dropbox.com/s/x7cvma5bjzlllbt/phylophlan_databases.txt?dl=1 and set up phylophlan_configs prior to all of the strainphlan runs.

fasnicar commented 3 years ago

Hi, for the configurations you should have no problems, as writing the config file doesn't require an internet connection, only that the tools are available on the system. As for the databases, you don't need to download them: StrainPhlAn builds a custom database locally for the species of interest, so that should work without an internet connection. I'm adding @abmiguez here, who will answer the same question asked in the bioBakery help forum.
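Since the config step only needs the tools to be present, a quick check on the cluster node before launching jobs might look like the sketch below. The function name missing_tools is hypothetical, and which tools to check depends on the config you write; diamond and makeblastdb are simply the two mentioned in this thread.

```shell
# Hypothetical preflight check: print every tool that is NOT on PATH and
# return non-zero if at least one is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return $rc
}
```

Calling missing_tools diamond makeblastdb inside the activated environment prints nothing and returns 0 when both are available.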

nick-youngblut commented 3 years ago

Hi @fasnicar. Thanks for the quick response. Sorry for the double post. I think that the problem is due to makeblastdb requiring excessive memory when run as a cluster job (i.e., >300G of vmem required for 38 marker gene sequences).

fasnicar commented 3 years ago

Hi @nick-youngblut, no problem, I just wanted to keep the discussion in one place. Thanks for finding the problem with makeblastdb. That is very strange indeed; I have personally never encountered that error before.

ruthalee commented 3 years ago

Hello,

I initially installed phylophlan with conda, but got the "phylophlan_configs folder does not exist" error that others have reported. I uninstalled phylophlan, reinstalled it by cloning from GitHub, and successfully ran the 4th tutorial, but I am still getting the same problem.

Command: phylophlan -i ../Geobacter_faa -d phylophlan --diversity low -f ../phylophlan/default_configs/supermatrix_aa.cfg

Error: [e] ../anaconda3/lib/python3.8/site-packages/PhyloPhlAn-3.0.2-py3.8.egg/phylophlan/phylophlan_configs/ folder does not exist

I created an empty phylophlan_configs folder just to see what would happen and I no longer got that error, but now I am getting a new one:

IsADirectoryError: [Errno 21] Is a directory: '../phylophlan/output_metagenomic_dists'

Do you think this is an installation problem? Thank you!

Hocnonsense commented 3 years ago

Hello, did you run the command phylophlan_write_default_configs.sh?

ruthalee commented 3 years ago

I ran: phylophlan_write_default_configs.sh default_configs

thanks!

fasnicar commented 3 years ago

Thanks, that is strange. It seems to me that the error you quoted:

I created an empty phylophlan_configs folder just to see what would happen and I no longer got that error, but now I am getting a new error: IsADirectoryError: [Errno 21] Is a directory: '../phylophlan/output_metagenomic_dists'

could be raised by phylophlan_metagenomic and not by phylophlan. Is that the case?

To understand this better, it would help if you could provide the version of the script (just use the --version param) and the full output with --verbose.

Many thanks, Francesco
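One way to capture the full command line together with its complete output for a report like this is a small wrapper that logs both. This is only a sketch; run_logged is a hypothetical name, and the only PhyloPhlAn option it relies on is the --verbose flag mentioned above.

```shell
# Hypothetical wrapper: record the command line, its stdout and stderr,
# and its exit status in a log file that can be attached to the issue.
#   $1 = log file, remaining arguments = the command to run
run_logged() {
    log="$1"
    shift
    {
        printf '$ %s\n' "$*"
        "$@"
    } >"$log" 2>&1
    rc=$?
    echo "exit status: $rc" >>"$log"
    return $rc
}
```

For example: run_logged phylophlan.log phylophlan -i ../Geobacter_faa -d phylophlan --diversity low -f ../phylophlan/default_configs/supermatrix_aa.cfg --verbose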

ruthalee commented 3 years ago

Thank you Francesco, here is the output: PhyloPhlAn version 3.0.64 (8 July 2021)

fasnicar commented 3 years ago

Thank you! That is the latest version, great.

From your previous message above:

Command: phylophlan -i ../Geobacter_faa -d phylophlan --diversity low -f ../phylophlan/default_configs/supermatrix_aa.cfg Error: [e] ../anaconda3/lib/python3.8/site-packages/PhyloPhlAn-3.0.2-py3.8.egg/phylophlan/phylophlan_configs/ folder does not exist I created an empty phylophlan_configs folder just to see what would happen and I no longer got that error, but now I am getting a new error: IsADirectoryError: [Errno 21] Is a directory: '../phylophlan/output_metagenomic_dists'

I don't think that error comes from phylophlan; I think it comes from phylophlan_metagenomic. It would be very helpful if you could provide both the full command line and the full output with the --verbose parameter.

Many thanks, Francesco

ruthalee commented 3 years ago

Thank you, Francesco! Here are my commands and outputs.

command: phylophlan --version --verbose
output: PhyloPhlAn version 3.0.64 (8 July 2021)

command: phylophlan_metagenomic --version --verbose
output: phylophlan_metagenomic.py version 3.0.36 (3 February 2021)

ruthalee commented 3 years ago

Hi Francesco, I got something to happen by doing the following:

- removed the git-cloned phylophlan
- reinstalled phylophlan with conda and put it in its own environment
- put the git-cloned example folder in anaconda3/envs/phylophlan/lib/python3.9/site-packages/phylophlan
- put the files I wanted to run in example folder 04
- ran the following command inside the example folder: phylophlan -i Geobacter_faa -d phylophlan --diversity low -f default_configs/supermatrix_aa.cfg

I got this error: [e] ".../anaconda3/envs/phylophlan/lib/python3.9/site-packages/phylophlan/phylophlan_configs/" folder does not exists

Then it cleaned up my .faa files, mapped three of them, and on the fourth one produced the following errors:

[e] Command '['.../anaconda3/envs/phylophlan/bin/diamond', 'blastp', '--quiet', '--threads', '1', '--outfmt', '6', '--more-sensitive', '--id', '50', '--max-hsps', '35', '-k', '0', '--query', 'Geobacter_faa_phylophlan/tmp/clean_aa/Geobacter_sp_UBA9964.faa', '--db', 'phylophlan_databases/phylophlan/phylophlan.dmnd', '--out', 'Geobacter_faa_phylophlan/tmp/map_aa/Geobacter_sp_UBA9964.b6o.bkp']' returned non-zero exit status 1.
[e] cannot execute command
[e] error while mapping
[e] gene_markers_identification crashed

Any idea what is happening here? Thanks so much for your help!

fasnicar commented 3 years ago

Thanks for reporting this. So the first error:

[e] ".../anaconda3/envs/phylophlan/lib/python3.9/site-packages/phylophlan/phylophlan_configs/" folder does not exists

is actually not a blocking error for PhyloPhlAn; it is more like a warning. You see only that message because all other output is suppressed (it can be enabled with the --verbose param).

The second error instead seems a bit strange considering that diamond ran for the other 3 proteomes. Can you try running the command (removing the --quiet) separately to see what error diamond returns:

../anaconda3/envs/phylophlan/bin/diamond blastp --threads 1 --outfmt 6 --more-sensitive --id 50 --max-hsps 35 -k 0 --query Geobacter_faa_phylophlan/tmp/clean_aa/Geobacter_sp_UBA9964.faa --db phylophlan_databases/phylophlan/phylophlan.dmnd --out Geobacter_faa_phylophlan/tmp/map_aa/Geobacter_sp_UBA9964.b6o.bkp

Many thanks, Francesco

ruthalee commented 3 years ago

Thank you for your help! I ran the command without --quiet.

Command: ../anaconda3/envs/phylophlan/bin/diamond blastp --threads 1 --outfmt 6 --more-sensitive --id 50 --max-hsps 35 -k 0 --query Geobacter_faa_phylophlan/tmp/clean_aa/Geobacter_sp_UBA9964.faa --db phylophlan_databases/phylophlan/phylophlan.dmnd --out Geobacter_faa_phylophlan/tmp/map_aa/Geobacter_sp_UBA9964.b6o.bk

Output (last 10 lines, no prior errors showed up):

Searching alignments... [0.657s]
Processing query block 1, reference block 1/1, shape 16/16, index chunk 4/4.
Building reference seed array... [1.732s]
Building query seed array... [0.012s]
Computing hash join... [0.388s]
Masking low complexity seeds... [0.007s]
Searching alignments... [0.655s]
Deallocating buffers... [0.012s]
Clearing query masking... [0s]
Computing alignments... Error: generate_output: target with no hsps.

fasnicar commented 3 years ago

Thank you for reporting this. I have to say this is the first time I have seen that error from diamond, and I wasn't able to find anything related to it (only the code that generates it: https://github.com/bbuchfink/diamond/blob/master/src/align/output.cpp)!

I'm not sure how to debug this. Would it be OK for you to share your input files with me so that I can run some tests myself?

Many thanks, Francesco

ruthalee commented 3 years ago

Thank you so much Francesco, I am emailing the 3 files that mapped along with the one that errored. I also tried running phylophlan with just the three files that mapped the first time and got the error that I have attached. Once again, thank you so much for your help!

[Attached screenshot: Screen Shot 2021-10-21 at 1 37 16 PM]
ruthalee commented 3 years ago

For anyone looking at this later, Francesco found it was a problem with DIAMOND and we got it to work by using DIAMOND version 2.0.7. Thank you Francesco!
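To check whether an environment already has the DIAMOND version that worked here, something like the sketch below may help. The function names are hypothetical, and the parsing assumes that diamond --version prints a single line ending in the version number (for example "diamond version 2.0.7").

```shell
# Hypothetical helpers for checking the installed DIAMOND version.

# Extract the last whitespace-separated field of a version line.
diamond_version_from() {
    printf '%s\n' "$1" | awk '{ print $NF }'
}

# Return 0 if the diamond on PATH reports the wanted version,
# 1 if it reports another one, 2 if diamond is not found.
check_diamond_version() {
    wanted="$1"
    if ! command -v diamond >/dev/null 2>&1; then
        echo "diamond not found on PATH" >&2
        return 2
    fi
    found=$(diamond_version_from "$(diamond --version 2>/dev/null)")
    if [ "$found" != "$wanted" ]; then
        echo "diamond $found found, $wanted wanted" >&2
        return 1
    fi
}

# To pin the version in the conda environment used in this thread
# (assumes the bioconda channel still provides this build):
#   conda install -n phylophlan -c bioconda diamond=2.0.7
```

Running check_diamond_version 2.0.7 inside the activated environment then tells you whether the pin is needed.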