Dennis-xyHuang / PhyloPlus

MIT License

TaxID #1

Closed oclaisse closed 1 year ago

oclaisse commented 1 year ago

I submitted the following TaxID file (TaxID1.txt), produced by MetaPhlAn, and received the error message below.

Sorry, there was an error generating your PhyloPlus results. The error messsage was:

Error: Command failed: python3 lib/scripts/extract_lineage_user_provided.py "tmp/output-tmp-7-1674053303646/TaxID1.txt" /tmp/tmp-356076-6ArGbl88KYaH "olivier.claisse@inrae.fr" lib/NCBI_dmp_files
Traceback (most recent call last):
  File "lib/scripts/extract_lineage_user_provided.py", line 204, in <module>
    user_summary_2 = convert_taxid_to_lineage_main(uniq_user_taxids_rmng)
  File "lib/scripts/extract_lineage_user_provided.py", line 56, in convert_taxid_to_lineage_main
    uniq_sci_name[taxid] = temp_dict["ScientificName"]
TypeError: 'NoneType' object is not subscriptable

Please contact JIFSAN IT for support.


Dennis-xyHuang commented 1 year ago

@oclaisse Hello, I tried to reproduce the result using the input you provided and found that the error was caused by taxIDs that could not be found in the NCBI databases, e.g., 6115856 and 7415705. I have modified the script and updated the web portal. These erroneous taxIDs will now be recorded in the note.txt file in the final output.
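For readers hitting the same traceback: the crash comes from subscripting a lookup result that is `None` for an unknown taxID. The following is only a minimal sketch of one way to guard against that, not the actual change made to `extract_lineage_user_provided.py`. The `lookup_taxid` helper, the in-memory `TOY_DB`, and the `write_note` function are hypothetical stand-ins; only the `ScientificName` key and the `note.txt` file name come from this thread.

```python
from pathlib import Path

# Hypothetical toy database; the real script reads the NCBI dump files instead.
TOY_DB = {
    562: {"ScientificName": "Escherichia coli"},
    1280: {"ScientificName": "Staphylococcus aureus"},
}


def lookup_taxid(taxid):
    """Return the record for a taxID, or None when the taxID is unknown."""
    return TOY_DB.get(taxid)


def convert_taxids(taxids):
    """Resolve taxIDs to scientific names, collecting unknown IDs instead of crashing."""
    names, missing = {}, []
    for taxid in taxids:
        record = lookup_taxid(taxid)
        if record is None:
            # This is the case that raised the TypeError in the traceback.
            missing.append(taxid)
            continue
        names[taxid] = record["ScientificName"]
    return names, missing


def write_note(missing, out_dir):
    """Record unresolved taxIDs in note.txt inside out_dir and return the file path."""
    note_path = Path(out_dir) / "note.txt"
    lines = [f"taxID {taxid} not found in NCBI databases" for taxid in missing]
    note_path.write_text("\n".join(lines) + ("\n" if lines else ""))
    return note_path
```

Calling `convert_taxids([562, 6115856])` would resolve the first ID and report the second as missing, which can then be passed to `write_note`.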

I also noticed that some of the taxIDs in the input belong to higher taxonomic ranks, such as class. Please note that these taxIDs will also be filtered out and won't be added to the phylogeny, so that the taxonomic ranks of tip labels in the final phylogeny are coherent and all pairwise distances represent interspecific distances.
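The rank filter described above could look roughly like the sketch below. It assumes that only taxIDs whose rank is exactly "species" are kept, which is an interpretation of "interspecific distances" and not something confirmed by the portal's code. The `TOY_RANKS` table and the `split_by_rank` function are hypothetical names used for illustration.

```python
# Hypothetical toy rank table; a real run would take ranks from the NCBI taxonomy.
TOY_RANKS = {
    562: "species",   # Escherichia coli
    1280: "species",  # Staphylococcus aureus
    561: "genus",     # Escherichia
    1236: "class",    # Gammaproteobacteria
}


def split_by_rank(taxids, ranks, keep_rank="species"):
    """Keep taxIDs at keep_rank; return the others with the rank that excluded them."""
    kept, dropped = [], {}
    for taxid in taxids:
        rank = ranks.get(taxid, "unknown")
        if rank == keep_rank:
            kept.append(taxid)
        else:
            dropped[taxid] = rank
    return kept, dropped
```

Returning the dropped taxIDs together with their ranks makes it straightforward to tell the user why an entry is absent from the final phylogeny.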

Please let me know if this error still persists for the web portal.

oclaisse commented 1 year ago

Hello, I tried again with the same file, and below is the new error message:

Sorry, there was an error generating your PhyloPlus results. The error messsage was:

Error: Command failed: Rscript lib/scripts/expand_phylogeny.R lib/input/GTDB_r207_bacterial.tree /tmp/tmp-356076-7yZdJdF7TTaT 1 2 2 0.75

If you use the ggtree package suite in published research, please cite the appropriate paper(s):

Guangchuang Yu, Tommy Tsan-Yuk Lam, Huachen Zhu, Yi Guan. Two methods for mapping and visualizing associated data on phylogeny using ggtree. Molecular Biology and Evolution. 2018, 35(12):3041-3043. doi:10.1093/molbev/msy194

Guangchuang Yu, David Smith, Huachen Zhu, Yi Guan, Tommy Tsan-Yuk Lam. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods in Ecology and Evolution. 2017, 8(1):28-36. doi:10.1111/2041-210X.12628

Attaching package: ‘tidytree’

The following object is masked from ‘package:stats’:

filter

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

filter, lag

The following objects are masked from ‘package:base’:

intersect, setdiff, setequal, union

Loading required package: Rcpp
Loading required package: ape
Loading required package: maps
treeio v1.18.1 For help: https://yulab-smu.top/treedata-book/

If you use treeio in published research, please cite:

LG Wang, TTY Lam, S Xu, Z Dai, L Zhou, T Feng, P Guo, CW Dunn, BR Jones, T Bradley, H Zhu, Y Guan, Y Jiang, G Yu. treeio: an R package for phylogenetic tree input and output with richly annotated and associated data. Molecular Biology and Evolution 2020, 37(2):599-603. doi: 10.1093/molbev/msz240

Attaching package: ‘treeio’

The following object is masked from ‘package:phytools’:

read.newick

The following object is masked from ‘package:ape’:

drop.tip

Killed

Please contact JIFSAN IT for support.

Best, Olivier

Dennis-xyHuang commented 1 year ago

Hi Olivier @oclaisse,

I reran the script on the server. The failure was due to one of the taxIDs, which could only be mapped at the phylum level, so a large distance was extracted during execution and the server ran out of memory. We have upgraded the server to an instance with more RAM. I uploaded your input again, and this time I got the output successfully.
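A bare "Killed" at the end of the log is hard to interpret from the user's side. As a rough sketch (not the portal's actual wrapper, whose implementation is not shown in this thread), a caller on a POSIX system can check whether a child process was ended by SIGKILL, which is what the kernel's out-of-memory killer sends. The `run_step` function name is hypothetical.

```python
import signal
import subprocess
import sys


def run_step(cmd):
    """Run one pipeline step and describe how it ended (POSIX semantics)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    code = proc.returncode
    if code == 0:
        return "ok"
    # subprocess reports death by signal N as -N; a shell in between reports 128 + N.
    if code in (-signal.SIGKILL, 128 + signal.SIGKILL):
        return "killed by SIGKILL (possibly out of memory)"
    if code < 0:
        return f"terminated by signal {-code}"
    return f"failed with exit code {code}"


if __name__ == "__main__":
    # Demo: a child process that sends SIGKILL to itself.
    child = "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"
    print(run_step([sys.executable, "-c", child]))
```

SIGKILL can also come from other sources (for example an administrator or a resource limit), so the message is worded as a possibility, not a diagnosis.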

Would you mind shooting me an email at xhuang96@umd.edu, just in case this error happens again, so I can send you a copy of your output? Also, I plan to update the method a little so that it uses more information to better locate rare species such as the one in your input. I have done some preliminary tests and the new method seems to work. Testing the new method and implementing it on the web portal may take 1~2 weeks. If you can send me an email, I would be more than happy to keep you updated and to hear any suggestions : )

Best, Dennis

oclaisse commented 1 year ago

Thank you very much, it works well now! Best, Olivier

Dennis-xyHuang commented 1 year ago

@oclaisse Hi Olivier, I made changes to the original method, as I noticed that the previous run for your input placed some of the species at relatively high levels, such as class or order. The newer version incorporates more information from the reference, so these species can now be located more precisely.

I also noticed that last time your input contained taxonomy IDs of different levels, including genus taxIDs, family taxIDs, etc., so I added an option that lets the user choose which taxonomic rank they want in the final result. For example, you can now choose "family" with the same input, and all tip labels in the final result will be family names found in your input.
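To illustrate the rank option, here is a small sketch of collapsing mixed-rank taxIDs to one chosen rank. It assumes each taxID has a lineage mapping from rank to name, and that a taxID with no name at the requested rank (for example a class-level taxID when "family" is chosen) is skipped. The `TOY_LINEAGES` table and the `tip_labels` function are hypothetical, and the portal's actual method may differ.

```python
# Hypothetical toy lineages (rank -> name); real lineages come from the NCBI taxonomy.
TOY_LINEAGES = {
    562: {"species": "Escherichia coli", "genus": "Escherichia",
          "family": "Enterobacteriaceae"},
    573: {"species": "Klebsiella pneumoniae", "genus": "Klebsiella",
          "family": "Enterobacteriaceae"},
    561: {"genus": "Escherichia", "family": "Enterobacteriaceae"},
    1280: {"species": "Staphylococcus aureus", "genus": "Staphylococcus",
           "family": "Staphylococcaceae"},
    1236: {"class": "Gammaproteobacteria"},
}


def tip_labels(taxids, lineages, rank):
    """Return unique names at the chosen rank (first-seen order) and skipped taxIDs."""
    labels, skipped = {}, []
    for taxid in taxids:
        name = lineages.get(taxid, {}).get(rank)
        if name is None:
            # The taxID is unknown or sits above the requested rank.
            skipped.append(taxid)
        else:
            labels[name] = None  # dict keys keep insertion order and drop duplicates
    return list(labels), skipped
```

With this toy table, choosing "family" merges the two Enterobacteriaceae species and the Escherichia genus entry into one tip, while the class-level taxID is skipped.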

The updates are already live on the portal, but have not yet been pushed to this repo. Please check it out if you are interested : )

Best, Dennis