Closed wangpeng407 closed 4 years ago
Hi,
Thanks for reporting this. At least some of the errors you're seeing might not be caused by a bad or unstable connection; they may occur because the IDs have changed in UniRef, so those IDs can no longer be downloaded directly.
phylophlan_setup_database records those failed downloads and retries them right afterwards, using the UniRef APIs to resolve the old UniRef90 IDs into the newer ones.
If the second attempt also fails for some of the IDs, you'll find a file named <taxonomic_label>_core_proteins_not_mapped.txt in the output folder, which lists the UniRef IDs that could not be downloaded.
Many thanks, Francesco
Dear Francesco,
Thanks for the brilliant tool. When running the test command
phylophlan_setup_database -g s__Staphylococcus_aureus -o 01_saureus --verbose
it sometimes fails because of a slow or unstable network connection, with warnings like this:
So, would it be possible to modify the script to extract the sequences from the downloaded "uniref90.fasta.gz" according to their UniRef IDs? If so, I think that would be more convenient and user-friendly.
Thank you very much.
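For illustration, the kind of local lookup asked about above could be sketched in Python roughly as follows. This is not how phylophlan_setup_database works and is not part of PhyloPhlAn; the function names `read_fasta` and `extract_by_id` are hypothetical. The sketch assumes that the first whitespace-delimited token of each FASTA header in uniref90.fasta.gz is the cluster ID (for example `UniRef90_A0A009`) and that the IDs being looked up are written the same way.

```python
import gzip
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def extract_by_id(fasta_gz_path: str, wanted_ids: Iterable[str]) -> Dict[str, str]:
    """Return {id: sequence} for the wanted IDs found in a gzipped FASTA file.

    Hypothetical helper: assumes the record ID is the first token of the header.
    IDs that are absent from the file are simply missing from the result.
    """
    wanted = set(wanted_ids)
    found: Dict[str, str] = {}
    if not wanted:
        return found
    # Stream the file in text mode so the whole database is never held in memory.
    with gzip.open(fasta_gz_path, "rt") as handle:
        for header, sequence in read_fasta(handle):
            tokens = header.split(None, 1)
            record_id = tokens[0] if tokens else ""
            if record_id in wanted:
                found[record_id] = sequence
                if len(found) == len(wanted):
                    break  # every requested ID was found; stop scanning early
    return found
```

Calling `extract_by_id("uniref90.fasta.gz", ids)` scans the file once and returns only the matching sequences. IDs that were renamed or removed in the local UniRef release would still be missing from the result, so a local lookup like this would not by itself resolve the changed-ID case described in the reply.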