Open Anto007 opened 4 years ago
Also, I think it would be very convenient for users to directly have the sequence files that are relevant for all complete circular phages detected by VIBRANT (i.e., 'VIBRANT_complete_circular_metagenome.tsv')
Hi,
Predicting phage taxonomy is not simple, and in some cases the distinction between different groups is not well defined. I am currently working on a phage taxonomy tool as a side project, but the implementation into VIBRANT will likely not be soon. It's also difficult to quickly assess whether a phage is novel because the virus databases, taken together, are very large (NCBI, IMG/VR, GOV2, single publications, etc.). A tool that currently exists for this is vConTACT2, which relies mainly on reference viruses (i.e., NCBI RefSeq). vConTACT2 was used in the VIBRANT pre-print manuscript.
I'll consider the new file for circular viruses; it would be fairly easy to implement. I had another question about adding a file for prophage coordinates. I'm going to wait for any other necessary updates and put all these suggestions together in a single update. This will likely be in a couple of weeks, and it will likely include just minor enhancements like this and not anything to do with the method of virus identification.
Kris
Thank you for your quick response. I agree with you on phage taxonomy and so no major worries there. I look forward to seeing the new updates.
Don't the pVOGs used in VIBRANT give a taxonomic annotation?
Hi Silas,
One of the most widely used methods for host prediction is to use the percent identity of phage proteins to those of known reference phages. If you get hits to a known phage, you likely know the host (based on the host of the reference phage). VIBRANT specifically uses HMMs, which is different from running BLAST against a protein database. HMMs may contain information from diverse phages that do not share the same taxonomy or host. That means VIBRANT can't use taxonomic info from VOGs in this case (VIBRANT doesn't use the pVOGs database). I'm working on putting together a quick taxonomy prediction tool using reference phages that will do taxonomy annotation to the Family level.
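To make the identity-based approach concrete, here is a minimal sketch of the idea, not anything VIBRANT does. It assumes you already have protein hits as (phage, reference phage, percent identity) tuples, for example parsed from tabular BLAST output, and a lookup of reference phage to host. The function name, the input shapes, and the 70% cutoff are all hypothetical choices for illustration.

```python
def predict_hosts(hits, reference_hosts, min_identity=70.0):
    """Assign each phage the host of its best-matching reference phage.

    hits: iterable of (phage_id, reference_id, percent_identity) tuples.
    reference_hosts: dict mapping reference_id -> host name.
    min_identity: hypothetical cutoff; hits below it are ignored.
    """
    best = {}  # phage_id -> (reference_id, percent_identity)
    for phage_id, ref_id, identity in hits:
        # Skip weak hits and references with no recorded host.
        if identity < min_identity or ref_id not in reference_hosts:
            continue
        # Keep only the highest-identity hit per phage.
        if phage_id not in best or identity > best[phage_id][1]:
            best[phage_id] = (ref_id, identity)
    return {phage: reference_hosts[ref] for phage, (ref, _) in best.items()}
```

A phage with no hit above the cutoff simply gets no prediction, which is the safer outcome for novel phages.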
Kris
Hi @KrisKieft! Really interested in using VIBRANT in some of our lab's studies. I am curious as to which output files from VIBRANT you used as input to vConTACT2. Any tips are greatly appreciated.
Scott
Hi Scott,
I have now added a new script to the scripts/ folder. A full explanation can be found in the updated README at the top (Content Addition). This script can be used to reformat VIBRANT protein outputs for vConTACT2. The specific file that you want to use is combined_phages.faa in your phages output folder. From there you need to make a gene-to-genome file for vConTACT2, and for the type of proteins you input you want to select Prodigal format. Hope that helps.
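For anyone who wants to see what building a gene-to-genome file involves, here is a rough sketch. It is not the script in the scripts/ folder. It assumes Prodigal-style protein IDs, where the genome name is the protein ID with the trailing "_<gene number>" removed, and it assumes the three-column CSV layout (protein_id, contig_id, keywords) that vConTACT2's own gene-to-genome helper writes. The function names and the "None" keyword value are placeholders, so check them against your vConTACT2 version.

```python
import csv


def gene_to_genome_rows(faa_lines):
    """Yield (protein_id, contig_id, keywords) for each FASTA header line."""
    for line in faa_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        tokens = line[1:].split()
        if not tokens:
            continue  # empty header, nothing to record
        # The first whitespace-delimited token is taken as the protein ID.
        protein_id = tokens[0]
        # Assumed Prodigal-style naming: "<genome>_<gene number>".
        contig_id, sep, gene_number = protein_id.rpartition("_")
        if not sep or not gene_number.isdigit():
            contig_id = protein_id  # no numeric suffix; keep the ID as-is
        yield protein_id, contig_id, "None"


def write_gene_to_genome(faa_lines, out_handle):
    """Write the rows as CSV with the assumed vConTACT2 header."""
    writer = csv.writer(out_handle, lineterminator="\n")
    writer.writerow(["protein_id", "contig_id", "keywords"])
    writer.writerows(gene_to_genome_rows(faa_lines))
```

Usage would look something like opening combined_phages.faa for reading and an output CSV for writing (with newline=""), then calling write_gene_to_genome(faa, out).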
Kris
Thanks so much @KrisKieft. I will have a chance to work with the script later this week. Many thanks in advance!
Scott
Hi @KrisKieft,
I'm trying to run some VIBRANT output through vConTACT2 and I'm curious which VIBRANT output to use to generate the gene-to-genome file. I thought I saw a file with the associations with the HMMs, but now that I'm going to look for it I don't see anything like that. Could you clarify which output to use for this step?
Thanks, Samantha
Hi Samantha,
Can you please open a new Issue for this question? Under the Issues tab there's a green "New Issues" button. This will help with organization of specific questions because I've had this one before regarding vConTACT2. You can simply copy and paste your question. Once that is up I'll post my answer for clarity. Thank you for your cooperation.
Kris
Hi, I was wondering why VIBRANT doesn't output the taxonomic identities of predicted phages (in cases where they are not novel)? If VIBRANT does do this, where can I find this info in the output?