Closed alegione closed 4 years ago
Dear Alistair Legione,
Many thanks for reporting this.
One thing I noticed is that the database ID in your example is not correct: it should be SGB.Dec19 (you can obtain the list of available databases with the --database_list parameter).
So, I assume that the file phylophlan_metagenomic.txt has been successfully downloaded, right?
If not you can download it using this URL:
https://www.dropbox.com/s/xdqm836d2w22npb/phylophlan_metagenomic.txt
You can get the 3 URLs for the SGB.Dec19 database from this file:
$ grep SGB.Dec19 phylophlan_metagenomic.txt
https://www.dropbox.com/s/l73jvga66ql4ows/SGB.Dec19.md5?dl=1 SGB.Dec19.md5
https://www.dropbox.com/s/djm9thsykn9h63s/SGB.Dec19.tar?dl=1 SGB.Dec19.tar
https://www.dropbox.com/s/dw947euykyjeee7/SGB.Dec19.txt.bz2?dl=1 SGB.Dec19.txt.bz2
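As a quick sanity check before running anything, the lookup above can be wrapped in a small helper that counts how many lines of the list mention a given database name. This is only a sketch: count_db_urls is a hypothetical name, and it assumes the tool matches the name case-sensitively, which is what the "invalid number of URLs" error reported in this thread suggests.

```shell
# Count the lines of the database list that mention a given database name.
# grep -F treats the name as a literal string and is case-sensitive by default.
count_db_urls() {
    # $1: database name, $2: path to phylophlan_metagenomic.txt
    grep -cF -- "$1" "$2"
}

# Example (the listing quoted above has three lines for SGB.Dec19):
# count_db_urls SGB.Dec19 phylophlan_metagenomic.txt
# count_db_urls SGB.DEC19 phylophlan_metagenomic.txt
```

If the count for the name you pass with -d is not 3, the name is probably misspelled or in the wrong case.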
You can manually download all 3 files with wget (please remember to remove the ?dl=1 from the URL) and save them into a folder of your choice.
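The manual download can be scripted roughly as follows. This is a sketch, not part of PhyloPhlAn: db_urls and fetch_db are hypothetical helper names, and the code assumes each line of the list has the form "<URL> <file name>", as in the grep output above. wget -c is used so that an interrupted download can be resumed.

```shell
# Print the download URLs for one database, with the trailing "?dl=1" removed.
db_urls() {
    # $1: database name, $2: path to phylophlan_metagenomic.txt
    grep -F -- "$1" "$2" | awk '{ sub(/\?dl=1$/, "", $1); print $1 }'
}

# Download every file of one database into a folder, resuming partial downloads.
fetch_db() {
    # $1: database name, $2: list file, $3: destination folder
    mkdir -p "$3" || return 1
    urls=$(db_urls "$1" "$2")
    # Stop early if the name matched nothing in the list.
    [ -n "$urls" ] || { echo "no URLs found for $1" >&2; return 1; }
    while read -r url; do
        wget -c -P "$3" "$url" || return 1
    done <<EOF
$urls
EOF
}

# Example:
# fetch_db SGB.Dec19 phylophlan_metagenomic.txt db
```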
Assuming you have now downloaded the 3 files for the SGB.Dec19 release into the db folder, you can run:
phylophlan_metagenomic -i input_folder -d SGB.Dec19 --database_folder db/
and phylophlan_metagenomic should correctly detect that the database files are already available inside the db folder and run without downloading anything else.
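Before launching the run, it may help to confirm that all three files really are in the folder, since a missing one can trigger a new download attempt. The helper below is a hypothetical sketch (db_files_present is not a PhyloPhlAn command); the file names are taken from the listing above, and it only checks that the files exist, not that they are complete.

```shell
# Succeed only if the .md5, .tar and .txt.bz2 files of a database
# are all present in the given folder.
db_files_present() {
    # $1: database name, $2: database folder
    for ext in md5 tar txt.bz2; do
        if [ ! -f "$2/$1.$ext" ]; then
            echo "missing: $2/$1.$ext" >&2
            return 1
        fi
    done
}

# Example:
# db_files_present SGB.Dec19 db && \
#     phylophlan_metagenomic -i input_folder -d SGB.Dec19 --database_folder db/
```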
Please let me know if you should find any issues with these steps.
Many thanks, Francesco
Thanks Francesco, such a rookie error!
Going back through my history, it seems I originally had the database name correct but had only downloaded the tarball and not the txt file (and, from recollection, was getting an error about not having 3 URLs). Somewhere between downloading the txt file and retyping the command, I switched to all caps for the database name (slap forehead).
I have fixed the spelling of the database name and so far haven't encountered an error when using the same command structure as my earlier command. Looking forward to seeing the results.
Thanks for picking up my problem!
Super, glad it helped. I'll close the issue then.
Hi Francesco,
I'm experiencing a similar issue, with the program trying to download the database files at every run.
I have tried to use --databases_folder in my command, but it still seems to start by downloading, and about 8 out of 10 times the download fails, probably because of the connection.
Here is my command. I have copied the phylophlan database to the location specified by --databases_folder:
phylophlan \
    --input_folder ./faa \
    -o ./out \
    --nproc 48 \
    --diversity low \
    -d phylophlan \
    --databases_folder /home/Staff/uqgni1/tools/phylophlan/database/phylophlan \
    -f /home/Staff/uqgni1/tools/phylophlan/phylophlan2_configs/protein-tree-updated.cfg \
    --configs_folder /home/Staff/uqgni1/tools/phylophlan/phylophlan2_configs \
    --submat_folder /home/Staff/uqgni1/tools/phylophlan/phylophlan2_substitution_matrices \
    --maas /home/Staff/uqgni1/tools/phylophlan/phylophlan2_substitution_models/phylophlan.tsv \
    -i wgt_v6
Forgive me for using some pp2 config files; they are working and I dare not change them. I'm happy to hear your suggestions, though.
The last lines of the error message read:
Downloading file of size: 0.00 MB
0.01 MB 2685.90 % 7.05 MB/sec 0 min -0 sec
Downloading file of size: 64.05 MB
[e] unable to download "https://www.dropbox.com/s/0h8ugr8hse4zmei/phylophlan.tar?dl=1"
What I want in the end is to tell phylophlan to use the database files I already downloaded.
Kind regards, Gaofeng
Thanks for the great-looking tool. My question/issue relates to phylophlan_metagenomic: is there a simple way to download, extract, and point to the database manually?
I have spent the last day trying to get phylophlan_metagenomic working but keep getting stuck with the database.
My cloud instance can't seem to complete the download from within the program without the occasional connectivity drop breaking the download, so I just keep having to restart and hope (so far to no avail).
I can easily download the .tar file manually with wget -c to avoid issues of connectivity loss, but then can't seem to find a way for the tool to see that the database exists.
I've tried the following:
phylophlan_metagenomic -i myfolder -o output-folder --nproc 8 -d SGB.DEC19 --database_folder place/with/the/database/
and get the following error:
[e] invalid number of URLs for "SGB.DEC19" in the downloaded file
Looking at the code, I can see a check for whether the database exists or whether the md5 exists. Both should be true (though technically the file is a .tar, so I'm not sure whether that would return true), but the program still runs the URL check and fails. Is there a way to download the database manually, set it up, and run the tool without having it try to download everything again?
I'm sure I'm missing something obvious, but I just can't work it out.