WrightonLabCSU / DRAM

Distilled and Refined Annotation of Metabolism: A tool for the annotation and curation of function for microbial and viral genomes
GNU General Public License v3.0

Suggestion: Decouple DRAM-v from VirSorter Requirement #35

Open btemperton opened 4 years ago

btemperton commented 4 years ago

When you ran DRAM-v.py, did you provide the VirSorter affi-contigs tab file, or did you not provide anything for the -v flag? If you didn't provide it, you will not be able to distill with DRAM-v, although you could still distill with DRAM. DRAM needs the VirSorter affi-contigs file in order to determine the auxiliary score.

Originally posted by @shafferm in https://github.com/shafferm/DRAM/issues/13#issuecomment-656883322

Many analyses now combine multiple tools, e.g. VirSorter/VirFinder/metaViralSPAdes, to pull out the viruses, followed by clustering into viral populations prior to analysis.

Needing the VirSorter affi-contigs tab file to distill virome results places a major constraint on using DRAM-v to analyse these combined files, and limits its scope for use with VirSorter2, which is starting to circulate.

The connection is also a little fuzzy, because VirSorter uses MetaGeneAnnotator for its protein predictions while DRAM-v uses Prodigal, so the genes in the affi-contigs tab file might not map perfectly to those found by DRAM-v.
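To make the mapping concern concrete, here is a minimal sketch of how one might check whether gene calls from two predictors line up on the same contig. It is not taken from DRAM or VirSorter: the `Gene` record, the `match_genes` helper, and the 0.9 overlap threshold are all hypothetical, and real output from either tool would need to be parsed into this shape first.

```python
from typing import Dict, List, NamedTuple


class Gene(NamedTuple):
    """Hypothetical minimal gene record (coordinates are 1-based, inclusive)."""
    contig: str
    start: int
    end: int
    strand: str


def overlap_fraction(a: Gene, b: Gene) -> float:
    """Shared length divided by the longer gene's length (0.0 if no overlap)."""
    if a.contig != b.contig or a.strand != b.strand:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    longest = max(a.end - a.start + 1, b.end - b.start + 1)
    return shared / longest


def match_genes(set_a: List[Gene], set_b: List[Gene],
                min_overlap: float = 0.9) -> Dict[Gene, Gene]:
    """Map each gene in set_a to its best counterpart in set_b, if good enough.

    The threshold is an assumption chosen for illustration only.
    """
    matches = {}
    for a in set_a:
        best = max(set_b, key=lambda b: overlap_fraction(a, b), default=None)
        if best is not None and overlap_fraction(a, best) >= min_overlap:
            matches[a] = best
    return matches


# Toy example: the second pair differs in its start codon, so it falls
# below the threshold and is left unmatched.
calls_a = [Gene("contig_1", 1, 300, "+"), Gene("contig_1", 400, 700, "+")]
calls_b = [Gene("contig_1", 1, 300, "+"), Gene("contig_1", 460, 700, "+")]
matched = match_genes(calls_a, calls_b)
```

Genes left out of `matched` are the ones where a score assigned from one tool's gene calls could not be transferred cleanly to the other's.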

I'd suggest finding an alternative route to determine auxiliary score within DRAM to make it usable across other current and future viral prediction tools.

shafferm commented 3 years ago

Totally! This is something we have begun exploring. The problem is that we use VirSorter's evaluation of what is a viral hallmark or viral-like gene. VirSorter has a custom HMM library to do this, and I haven't found an easy way to pull it out. What we are looking to test is whether the annotations we are already getting from VOGDB are good enough for us to call replication and structure genes hallmark and other viral genes viral-like. This is going to take some serious testing, and I probably won't get to it until early next year when we start our next DRAM push.
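A rough sketch of the idea being tested might look like the following. It is only an illustration, not DRAM's implementation. It assumes VOGDB's functional category codes (`Xr` for replication, `Xs` for structure, with `Xh`, `Xp` and `Xu` for other viral functions) arrive as a semicolon-separated string per gene; the function name, the input format, and the returned labels are all hypothetical.

```python
from typing import Optional

# Assumption: replication (Xr) and structure (Xs) categories count as hallmark.
HALLMARK_CODES = {"Xr", "Xs"}


def classify_vog_hit(categories: Optional[str]) -> str:
    """Label a gene from its VOGDB category string (hypothetical format).

    Returns "hallmark" if any category is replication or structure,
    "viral_like" for any other VOGDB hit, and "no_hit" when the gene
    has no VOGDB annotation at all.
    """
    if not categories:
        return "no_hit"
    codes = {code.strip() for code in categories.split(";") if code.strip()}
    if not codes:
        return "no_hit"
    if codes & HALLMARK_CODES:
        return "hallmark"
    return "viral_like"


labels = [classify_vog_hit(c) for c in ["Xr", "Xh;Xu", "Xs;Xp", "", None]]
```

Whether labels produced this way agree well enough with VirSorter's own hallmark and viral-like calls is exactly the open question that would need testing.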

All your suggestions and bug reports have been really insightful @btemperton!

btemperton commented 3 years ago

It's a great piece of software! I'd be surprised if Simon Roux can't just extract out the HMMs for VirSorter and send them to you.

Having the ability to add in your own HMM libraries would be an amazing addition. I've been thinking that if you wanted to screen MAGs/vMAGs for AMR genes, there are a couple of HMM datasets that would be super valuable.

shafferm commented 3 years ago

@btemperton I don't know if you've seen it, but the VirSorter team added the ability to prep contigs for annotation with DRAM-v when using VirSorter2. It's in their README here: https://bitbucket.org/MAVERICLab/virsorter2/src/master/

It's still not a full solution, but at least now DRAM will work with both VirSorter and VirSorter2.