MEGARes and AmrPlusPlus - A comprehensive database of antimicrobial resistance genes and user-friendly pipeline for analysis of high-throughput sequencing data
Hello! Thank you very much for curating AmrPlusPlus and for managing this GitHub so closely -- your documentation and your responses to the existing issues have helped me a lot! I have been able to get AmrPlusPlus to successfully run on my samples, which is great. However, as I was compiling the output data for my analysis, I noticed that there are a couple of discrepancies between the different output files (group.tsv, type.tsv, class.tsv, in particular).
For example, the group.tsv for one of my samples shows this:
(This is just a header)
Sample	Group	Hits
ER0331.amr.alignment.dedup	Aminocoumarins,Aminocoumarin-resistant DNA topoisomerases,PARE	723
ER0331.amr.alignment.dedup	Aminoglycosides,16S rRNA methyltransferases,RMTF	204
ER0331.amr.alignment.dedup	Aminoglycosides,Aminoglycoside N-acetyltransferases,AAC3	21
I noticed that the "Group" is actually a comma-separated list of the Type, Class, and Group. This was easy enough to address, but when I consolidated the read counts for all "Aminocoumarins" or "Aminoglycosides" and their respective classes in this example, those totals do not match the numbers output in the type.tsv and the class.tsv for this sample.
I was wondering if this issue has been raised by anyone else and, if so, whether this is normal. Which values would you recommend using in this case: the class and type values consolidated from the group.tsv, or those from the separate type.tsv and class.tsv files?
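For anyone who wants to reproduce the comparison, the consolidation described above can be sketched roughly as follows. This is only a minimal sketch: it assumes the file is tab-separated with the three columns shown (Sample, Group, Hits) and that the Group field always holds three comma-separated levels, which this post calls Type, Class, and Group. The function name `rollup_group_hits` is hypothetical and not part of AmrPlusPlus.

```python
import csv
import io
from collections import defaultdict


def rollup_group_hits(group_tsv_text):
    """Sum group-level hits up to the two higher levels.

    Assumption: columns are Sample, Group, Hits (tab-separated) and the
    Group field looks like "Type,Class,Group", as in the rows quoted above.
    """
    by_type = defaultdict(int)
    by_class = defaultdict(int)
    reader = csv.reader(io.StringIO(group_tsv_text), delimiter="\t")
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) != 3:
            continue  # ignore malformed lines rather than guess
        sample, label, hits = row
        parts = label.split(",", 2)
        if len(parts) != 3:
            continue  # label does not have the three expected levels
        amr_type, amr_class, _group = parts
        by_type[(sample, amr_type)] += int(hits)
        by_class[(sample, amr_type, amr_class)] += int(hits)
    return dict(by_type), dict(by_class)


# The three rows quoted in this post, written with tabs between columns.
EXAMPLE = (
    "Sample\tGroup\tHits\n"
    "ER0331.amr.alignment.dedup\tAminocoumarins,Aminocoumarin-resistant DNA topoisomerases,PARE\t723\n"
    "ER0331.amr.alignment.dedup\tAminoglycosides,16S rRNA methyltransferases,RMTF\t204\n"
    "ER0331.amr.alignment.dedup\tAminoglycosides,Aminoglycoside N-acetyltransferases,AAC3\t21\n"
)

type_totals, class_totals = rollup_group_hits(EXAMPLE)
for key, total in sorted(type_totals.items()):
    print(key, total)
```

The resulting per-sample totals can then be compared line by line against the values in type.tsv and class.tsv to see exactly where the numbers diverge.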
Relatedly, I noticed that my mechanism.tsv files are formatted oddly:
The sample name and group are listed on one line while the entire gene and read count value are on the second -- is this how the mechanism.tsv is supposed to be written? I noticed this issue when I was trying to merge my sample files into a comprehensive sheet and was met with some grumpy error code.
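As a possible workaround while merging, wrapped rows like this could be stitched back together before parsing. The sketch below rests on several assumptions that may not hold for the real files: that the file is tab-separated, that every record should have three fields, and that the stray line break falls inside a field rather than replacing a tab. The function name `rejoin_wrapped_rows` and the sample rows are made up for illustration.

```python
def rejoin_wrapped_rows(lines, expected_fields=3, sep="\t"):
    """Concatenate physical lines until a record has enough fields.

    Assumption: a record that wraps onto a second line was broken in the
    middle of a field, so the pieces are joined with no separator.
    """
    records = []
    pending = ""
    for raw in lines:
        piece = raw.rstrip("\r\n")
        if not piece and not pending:
            continue  # skip blank lines between records
        pending += piece
        if pending.count(sep) + 1 >= expected_fields:
            records.append(pending.split(sep))
            pending = ""
    if pending:
        raise ValueError(f"incomplete trailing record: {pending!r}")
    return records


# Hypothetical content: the second record is split across two lines.
wrapped = [
    "Sample\tMechanism\tHits\n",
    "S1\tTypeA,ClassB\n",
    ",MechC\t12\n",
    "S1\tTypeA,ClassB,MechD\t3\n",
]
for record in rejoin_wrapped_rows(wrapped):
    print(record)
```

If the break actually replaces a tab instead of sitting inside a field, the pieces would need to be joined with `sep` instead, so it is worth inspecting a few raw lines first.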
I'm not sure if these discrepancies are something to worry about or if this is an indication that my install/run of AmrPlusPlus was faulty/corrupted somehow, but I thought I would reach out to see if you had any advice for moving forward. Thank you very much for your time!