Closed Iris7788 closed 1 year ago
Hi, you can get the lineage information from the names/nodes files in the taxonomy subfolder. These files are in the standard format for names/nodes files for Kraken2 databases - e.g., see this link - https://github.com/DerrickWood/kraken2/issues/436. Please see the methods section of our paper for how taxonomic assignments were made for each viral genome. Hope this helps!
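For anyone who wants to do this in a script, here is a minimal sketch of walking a lineage from the names/nodes files. It assumes the files use the NCBI-style `.dmp` layout (fields separated by a tab, a pipe, and a tab) that Kraken2 taxonomies normally use. The helper names and the demo records (including taxonomy ID 900001) are made up for illustration and are not taken from the Phanta database.

```python
import io


def read_dmp(handle):
    """Yield the fields of each row of an NCBI-style .dmp file."""
    for line in handle:
        line = line.rstrip("\n")
        if line.endswith("\t|"):
            line = line[:-2]  # drop the row terminator
        if line:
            yield [field.strip() for field in line.split("\t|\t")]


def load_taxonomy(nodes_handle, names_handle):
    """Build parent, rank, and scientific-name lookups keyed by taxonomy ID."""
    parent, rank, name = {}, {}, {}
    for fields in read_dmp(nodes_handle):
        parent[fields[0]] = fields[1]
        rank[fields[0]] = fields[2]
    for fields in read_dmp(names_handle):
        # Keep only the scientific name; skip synonyms and other name classes.
        if len(fields) > 3 and fields[3] == "scientific name":
            name[fields[0]] = fields[1]
    return parent, rank, name


def lineage(taxid, parent, rank, name):
    """Return (rank, name) pairs from the root down to taxid."""
    path, seen = [], set()
    while taxid in parent and taxid not in seen:
        seen.add(taxid)  # guard against malformed, cyclic files
        path.append((rank[taxid], name.get(taxid, "unnamed")))
        if parent[taxid] == taxid:  # the root is its own parent
            break
        taxid = parent[taxid]
    return path[::-1]


# Made-up demo records; real data would come from the taxonomy subfolder.
DEMO_NODES = (
    "1\t|\t1\t|\tno rank\t|\n"
    "10\t|\t1\t|\tsuperkingdom\t|\n"
    "900001\t|\t10\t|\tspecies\t|\n"
)
DEMO_NAMES = (
    "1\t|\troot\t|\t\t|\tscientific name\t|\n"
    "10\t|\tViruses\t|\t\t|\tscientific name\t|\n"
    "900001\t|\tExample phage\t|\t\t|\tscientific name\t|\n"
)

parent, rank, name = load_taxonomy(io.StringIO(DEMO_NODES), io.StringIO(DEMO_NAMES))
print(lineage("900001", parent, rank, name))
```

With the real database you would replace the `io.StringIO` objects with open file handles for the nodes and names files (assumed here to be named `nodes.dmp` and `names.dmp`, as in a typical Kraken2 database) and pass in the taxonomy IDs found in seqid2taxid.map.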
Thank you for your kind reply. I would also like to know whether Phanta provides host and lifestyle information for predicted phages.
Hi, no problem. Yes, the host and lifestyle information is contained within the DB as species_name_to_vir_score.txt and host_prediction_to_genus.tsv. Virulence scores for phage species are between 0 and 1 and the host predictions are to the prokaryotic genus level.
Hi,
I have downloaded unmasked_db_v1 and would like to know which viral species the database covers. I noticed that the file “seqid2taxid.map” in the database folder maps genome names to taxonomy IDs. However, for genomes from MGV, the corresponding taxonomy IDs do not return any lineage information on the NCBI website. Is there a way to get this information?