Closed (fujch7 closed 2 years ago)
No taxonomy is provided with viralrecall, because it is based primarily on HMMs. If you wanted to assign taxonomy to the NCLDV regions afterwards you could take the proteins predicted in the viral regions and search them against the proteins in the GVDB (https://faylward.github.io/GVDB/).
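To make the suggestion above a little more concrete, here is a minimal sketch of one way to summarize such a search. It is not part of viralrecall. It assumes that you have already searched the predicted proteins against the GVDB proteins with a tool that writes BLAST-style tabular output (12 columns, bitscore last), that protein IDs follow the Prodigal-style `<contig>_<gene index>` pattern, and that you have built your own table mapping reference protein IDs to a taxon. The reference IDs and the mapping below are invented for illustration. As noted elsewhere in this thread, per-contig calls like this are only a rough hint, not a confident assignment.

```python
from collections import Counter, defaultdict


def best_hits(blast_lines):
    """Keep the highest-bitscore hit per query from BLAST tabular (12-column) lines."""
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def contig_of(protein_id):
    # Assumption: Prodigal-style protein IDs, "<contig>_<gene index>".
    return protein_id.rsplit("_", 1)[0]


def vote_taxonomy(blast_lines, subject_to_taxon, min_hits=3):
    """Majority vote over best hits; returns {contig: (taxon, fraction of votes)}."""
    votes = defaultdict(Counter)
    for query, (subject, _) in best_hits(blast_lines).items():
        taxon = subject_to_taxon.get(subject)
        if taxon is not None:
            votes[contig_of(query)][taxon] += 1
    result = {}
    for contig, counter in votes.items():
        total = sum(counter.values())
        if total < min_hits:
            continue  # too few informative proteins to say anything
        taxon, count = counter.most_common(1)[0]
        result[contig] = (taxon, count / total)
    return result


# Hypothetical example data (reference IDs and taxon mapping are made up).
EXAMPLE_TAXA = {"refA": "Imitervirales", "refB": "Imitervirales", "refC": "Algavirales"}
EXAMPLE_HITS = [
    "contig1_1\trefA\t80.0\t200\t40\t0\t1\t200\t1\t200\t1e-50\t300",
    "contig1_1\trefC\t40.0\t200\t120\t0\t1\t200\t1\t200\t1e-10\t90",
    "contig1_2\trefB\t75.0\t150\t37\t0\t1\t150\t1\t150\t1e-40\t250",
    "contig1_3\trefC\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-20\t120",
    "contig2_1\trefC\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-60\t310",
]
assignments = vote_taxonomy(EXAMPLE_HITS, EXAMPLE_TAXA)
```

In this toy input, `contig2` is dropped because it has only one informative protein, which is the situation where contig-level taxonomy is least trustworthy.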
Hi, thank you for this tool. I am also stuck at this step.
How can I do this? Do you also have a script to assign taxonomy to NCLDV?
Just to make sure we are talking about the same thing: to assign taxonomy confidently, it is necessary to bin contigs and get draft genomes. Assigning taxonomy to individual contigs is very difficult; some are very short and lack marker genes, for example. There are many ways to do binning. Simple tools like MetaBAT2 actually do a fairly good job, but there are other alternatives (see https://merenlab.org/2022/01/03/giant-viruses/).
If you already have bins/genomes, then I would recommend making a phylogeny using ncldv_markersearch with your genomes together with references (https://github.com/faylward/ncldv_markersearch). The default options of this tool use 7 marker genes to make the concatenated alignment, and you can then use IQ-TREE to make the final tree.
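For readers unfamiliar with the term, a concatenated alignment just joins the per-marker alignments end to end for each genome, padding with gaps where a genome lacks a marker. The sketch below illustrates that idea only; it is not the ncldv_markersearch implementation, and the marker and genome names are hypothetical. The tool builds this alignment for you, so you would not normally need to do it by hand.

```python
def concatenate_alignments(marker_alignments, gap="-"):
    """Join per-marker alignments into one supermatrix.

    marker_alignments: list of dicts {genome_name: aligned_sequence}, one per marker.
    Genomes missing from a marker are padded with gaps of that marker's width.
    """
    genomes = sorted({g for aln in marker_alignments for g in aln})
    pieces = {g: [] for g in genomes}
    for aln in marker_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one marker alignment must have equal length")
        width = lengths.pop()
        for g in genomes:
            pieces[g].append(aln.get(g, gap * width))
    return {g: "".join(parts) for g, parts in pieces.items()}


def to_fasta(alignment):
    """Format {name: sequence} as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in alignment.items())


# Hypothetical toy input: two markers, one bin and one reference genome.
marker_a = {"bin1": "MK-L", "ref1": "MKAL"}
marker_b = {"ref1": "GGT"}  # bin1 lacks this marker
supermatrix = concatenate_alignments([marker_a, marker_b])
fasta_text = to_fasta(supermatrix)
```

A FASTA file like `fasta_text` is the kind of input a tree-building program such as IQ-TREE takes. Genomes that are mostly gaps, because they lack most markers, tend to be placed unreliably in the tree.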
Hi, thanks for your amazing tool! I have run it successfully, but how can I get the taxonomy of the NCLDV sequences? I could not find it in the result files. I look forward to your reply.