Open ShailNair opened 1 year ago
Very interesting, I will assign this to our viral team. But first, here are some quick thoughts:
Do consider sharing your data if you are able; this will make DRAM stronger. These are just some quick notes you may want to mull over if you have not already. Expect a longer discussion in the future.
Hi, thank you for the prompt reply. Yes, I did check the AMG database, and the AMGs I am focussed on are not included there. I have attached the raw files and the DRAM-v annotations of the probable AMGs here. For your reference, I've also provided additional genes (the genes before and after each likely AMG), although my question only pertains to genes with the following KEGG orthologs (KOs): K16566, K16557, K16554, K16556, K16558, K16555, K16568, K16564, K16560, K16563. All of these are involved in the exopolysaccharide biosynthesis process.
Thank you dramv.output.zip
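For anyone who wants to pull the same genes out of their own annotations table, here is a minimal sketch of selecting rows by the KOs listed above. It uses only the Python standard library. The column names `gene_id` and `ko_id` and the tiny example table are assumptions for illustration, not taken from the attached files, so adjust them to match the header of your own file.

```python
import csv
import io

# KOs listed in the comment above (exopolysaccharide biosynthesis).
TARGET_KOS = {
    "K16566", "K16557", "K16554", "K16556", "K16558",
    "K16555", "K16568", "K16564", "K16560", "K16563",
}


def rows_with_target_kos(tsv_text, ko_column="ko_id"):
    """Return annotation rows whose KO column matches one of TARGET_KOS.

    `ko_column` is an assumed column name; change it to whatever your
    annotations file actually uses.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row for row in reader
        if (row.get(ko_column) or "").strip() in TARGET_KOS
    ]


# Made-up table for illustration only (not real DRAM-v output).
example = (
    "gene_id\tko_id\n"
    "contigA_1\tK16554\n"
    "contigA_2\t\n"
    "contigA_3\tK00001\n"
    "contigA_4\tK16566\n"
)

hits = rows_with_target_kos(example)
print([row["gene_id"] for row in hits])
```

To run it on a real file, read the file contents into a string first and pass that string to `rows_with_target_kos`.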
Thanks! We will look to add them to the AMG database. If you want to add them yourself and open a pull request, you can get credit on GitHub for your work and can say you are a contributor to DRAM. But I would understand if you wanted to wait until we confirm these before doing so. Of course, we will also look into broader ways to improve the AMG database.
I'll wait, because I have to conduct a deeper analysis to be confident that these are potential AMGs. It would also be worthwhile to hear from you once you have explored the provided files. Thank you for everything you and your team are doing with DRAM.
Hi, and many thanks for the excellent annotation tool. I have a query regarding viral AMG annotation. I have a set of viral contigs identified through metagenomics. The contigs were cleaned using CheckV and prepared for DRAM-v using VirSorter2. In the main annotation.tsv (not the distillate), I found some probable AMGs with a VirSorter category score of 0-2 and an auxiliary score of less than 4, but without any AMG flags. I understand that the missing AMG flag may be why they were not included in the final distillate, owing to the limitations of the AMG database used by DRAM-v. I also looked for viral-like genes in these contigs using the pVOGs, RefSeq-viral, and PHROGs databases with an e-value cutoff of <1E-5. However, I am still unsure whether these genes should be classified as AMGs, as some of them appear to meet the criteria for AMG classification apart from the missing AMG flag. Here is my annotated figure:
I would be happy to hear others' insights. I am mostly interested in how to interpret a viral contig that contains a gene cluster of more than three genes participating in one function, as in the case of contig A in the figure above. Contig A is more than 150 kb long (which falls into the jumbo phage category), is complete, and is without contamination (CheckV criteria).
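The screening described in this question (VirSorter category 0-2, auxiliary score below 4, no AMG flag) could be sketched roughly as follows. This is only an illustration using the Python standard library. The column names `virsorter_category`, `auxiliary_score`, `amg_flags`, and `gene_id`, and the example rows, are assumptions rather than values from the attached files, so check them against the header of your own annotations table.

```python
import csv
import io


def unflagged_candidates(tsv_text, max_category=2, max_aux_score=4):
    """Return rows that pass the score thresholds but carry no AMG flag.

    Column names are assumed, not confirmed against any specific
    DRAM-v version.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        try:
            category = int(float(row["virsorter_category"]))
            aux_score = int(float(row["auxiliary_score"]))
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without usable scores
        has_flag = bool((row.get("amg_flags") or "").strip())
        if 0 <= category <= max_category and aux_score < max_aux_score and not has_flag:
            kept.append(row)
    return kept


# Made-up rows for illustration only (not real DRAM-v output).
example = (
    "gene_id\tvirsorter_category\tauxiliary_score\tamg_flags\n"
    "g1\t1\t2\t\n"    # passes thresholds, no flag: kept
    "g2\t1\t2\tMK\n"  # already flagged: dropped
    "g3\t2\t5\t\n"    # auxiliary score too high: dropped
    "g4\t4\t1\t\n"    # category outside 0-2: dropped
    "g5\t0\t3\t\n"    # passes thresholds, no flag: kept
)

candidates = unflagged_candidates(example)
print([row["gene_id"] for row in candidates])
```

Rows that survive this filter are only candidates for manual review. Whether they count as AMGs still depends on the genomic context and database evidence discussed in this thread.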