Open btemperton opened 4 years ago
Totally! This is something we have begun exploring. The problem is that we rely on VirSorter's evaluation of what is a viral hallmark or viral-like gene. VirSorter has a custom HMM library for this, and I haven't found an easy way to extract it. What we want to test is whether the annotations we already get from VOGDB are good enough for us to call replication and structure genes hallmark genes and other viral genes viral-like. This is going to take some serious testing, and I probably won't get to it until early next year when we start our next DRAM push.
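As a rough illustration of the idea being tested (not how DRAM or VirSorter does it), here is a minimal sketch that labels a gene from its VOGDB functional categories. It assumes the category codes follow VOGDB's scheme as I understand it (`Xr` for replication, `Xs` for structure, others such as `Xh`, `Xp`, `Xu`); the function name, the input format, and the rule that any other VOG hit counts as viral-like are all hypothetical.

```python
# Hypothetical sketch: classify a gene from its VOGDB functional categories.
# Assumption: "Xr" = virus replication and "Xs" = virus structure; any other
# VOG category still means the gene hit a viral orthologous group.
HALLMARK_CATEGORIES = {"Xr", "Xs"}


def classify_gene(vog_categories):
    """Return 'hallmark', 'viral-like', or 'unannotated' for one gene.

    vog_categories: iterable of VOGDB category codes for the gene's best
    VOG hit, e.g. ["Xr"] or ["Xh", "Xs"]; empty if there was no VOG hit.
    """
    categories = set(vog_categories)
    if not categories:
        return "unannotated"
    if categories & HALLMARK_CATEGORIES:
        return "hallmark"
    return "viral-like"


# Made-up example genes, for illustration only.
example_genes = {
    "gene_1": ["Xr"],
    "gene_2": ["Xh", "Xs"],
    "gene_3": ["Xu"],
    "gene_4": [],
}
labels = {name: classify_gene(cats) for name, cats in example_genes.items()}
```

Whether labels like these agree with VirSorter's own hallmark calls is exactly the testing that would still need to be done.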
All your suggestions and bug reports have been really insightful @btemperton!
It's a great piece of software! I'd be surprised if Simon Roux can't just extract out the HMMs for VirSorter and send them to you.
Having the ability to add in your own HMM libraries would be an amazing addition. For example, if you wanted to screen MAGs/vMAGs for AMR genes, there are a couple of HMM datasets that would be super valuable.
@btemperton I don't know if you had seen it but the VirSorter team added the ability to prep contigs for annotation with DRAM-v when using VirSorter2. It's in their readme here: https://bitbucket.org/MAVERICLab/virsorter2/src/master/
It's still not a full solution but at least now DRAM will work with VirSorter and VirSorter 2.
Originally posted by @shafferm in https://github.com/shafferm/DRAM/issues/13#issuecomment-656883322
Many analyses now are combining multiple tools, e.g. VirSorter/VirFinder/metaViralSPAdes to pull out the viruses, followed by clustering into viral populations prior to analysis.
Needing the VIRSorter affi-contigs tab file to distill virome results places a major constraint on using DRAM-v to analyse these combined files and limits its scope to use with VirSorter2, which is starting to circulate.
The connection is also a little fuzzy, because VirSorter is using metageneannotator to do its protein predictions, but DRAM-v is using prodigal, so the genes in the affi-tab might not perfectly map to those found by DRAM-v.
I'd suggest finding an alternative route to determine auxiliary score within DRAM to make it usable across other current and future viral prediction tools.
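To make the gene-mapping concern concrete, here is a minimal sketch of matching gene calls from two predictors by coordinate overlap. It is only an illustration: the `Gene` tuple, the gene names, the coordinates, and the 0.9 overlap threshold are all made up, and this is not how DRAM-v or VirSorter link genes.

```python
from typing import NamedTuple, Optional


class Gene(NamedTuple):
    contig: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def overlap_fraction(a: Gene, b: Gene) -> float:
    """Shared bases divided by the length of the longer gene (0.0 if disjoint)."""
    if a.contig != b.contig or a.strand != b.strand:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    longest = max(a.end - a.start + 1, b.end - b.start + 1)
    return shared / longest


def match_genes(affi_genes, prodigal_genes, min_fraction=0.9):
    """Map each second-predictor gene to its best-overlapping affi-tab gene, or None."""
    matches: dict[str, Optional[str]] = {}
    for name, gene in prodigal_genes.items():
        best_name, best_fraction = None, 0.0
        for affi_name, affi_gene in affi_genes.items():
            fraction = overlap_fraction(gene, affi_gene)
            if fraction > best_fraction:
                best_name, best_fraction = affi_name, fraction
        matches[name] = best_name if best_fraction >= min_fraction else None
    return matches


# Made-up coordinates: one identical call, one with a shifted start,
# and one gene that only the second predictor found.
affi = {
    "gene_1": Gene("contig_1", 1, 300, "+"),
    "gene_2": Gene("contig_1", 400, 900, "-"),
}
prodigal = {
    "contig_1_1": Gene("contig_1", 1, 300, "+"),
    "contig_1_2": Gene("contig_1", 430, 900, "-"),
    "contig_1_3": Gene("contig_1", 1000, 1200, "+"),
}
matches = match_genes(affi, prodigal)
```

Genes that come back as `None` here are the ones that would end up with no hallmark or viral-like category from the affi-tab, which is the gap an alternative scoring route would need to close.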