KatherineJ-H closed 6 years ago
Katherine,
The resistance database, which is based on ResFinder and CARD, is the one included here with SRST2: data/ARGannot.r1.fasta
That file is kept up-to-date (we add new resistance genes periodically) and it's already correctly formatted. So just using that would be the easiest option - no clustering required, just give it to SRST2 with the --gene_db option.
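As a rough illustration of passing the bundled database to SRST2, here is a minimal Python sketch that only assembles the command line (it does not run anything). The read file names and output prefix are hypothetical placeholders, and the executable name `srst2` assumes a standard install; depending on your setup it may be `srst2.py` instead.

```python
import shlex


def build_srst2_command(reads_1, reads_2, output_prefix,
                        gene_db="data/ARGannot.r1.fasta"):
    """Assemble an SRST2 argument list for paired-end resistance gene detection.

    The executable name "srst2" is an assumption about the local install.
    """
    return [
        "srst2",
        "--input_pe", reads_1, reads_2,   # paired-end read files
        "--output", output_prefix,        # prefix for SRST2 output files
        "--log",                          # write a log file
        "--gene_db", gene_db,             # pre-formatted resistance database
    ]


# Hypothetical file names, for illustration only.
cmd = build_srst2_command("sample_1.fastq.gz", "sample_2.fastq.gz",
                          "sample_resistance")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be handed to `subprocess.run(cmd, check=True)` once the paths point at real files.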
If you are instead specifically interested in the CARD.fa resistance genes, then yes, you'll have to cluster them first and format them for SRST2 using the instructions here.
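One plausible reason an unformatted FASTA gives no usable results is that its sequence headers do not follow the layout SRST2 expects. The sketch below checks for that, assuming the header convention described in the SRST2 documentation as I recall it: four fields joined by double underscores (cluster ID, gene symbol, allele symbol, allele ID), optionally followed by a description. The helper names and example headers are hypothetical, and this is only a quick sanity check, not a substitute for the clustering and formatting instructions.

```python
def is_srst2_header(line):
    """Return True if a FASTA header has four non-empty '__'-separated fields.

    Assumed layout: >clusterID__geneSymbol__alleleSymbol__alleleID [description]
    """
    if not line.startswith(">"):
        return False
    tokens = line[1:].split(None, 1)   # first token is the sequence identifier
    if not tokens:
        return False
    parts = tokens[0].split("__")
    return len(parts) == 4 and all(parts)


def unformatted_headers(fasta_lines):
    """Collect header lines that do not match the assumed SRST2 layout."""
    return [
        line.rstrip("\n")
        for line in fasta_lines
        if line.startswith(">") and not is_srst2_header(line)
    ]


# Hypothetical records, for illustration only.
example = [
    ">12__geneA__geneA-1__345 some description\n",
    "ACGTACGT\n",
    ">plain_header with no cluster fields\n",
    "ACGTACGT\n",
]
print(unformatted_headers(example))
```

To check a real file, pass an open file handle instead of the example list, e.g. `unformatted_headers(open("CARD.fa"))`; any headers returned would need reformatting before the file is given to SRST2.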
Let me know if that sorts it out for you or if you have any other questions!
Ryan
Also, note that CARD includes many genes that are core chromosomal genes, not acquired resistance genes, so you will get hits for the chromosomal genes in every isolate you type, even if the allele present is not resistance-related. This is fine as long as you understand the underlying database you are working with and interpret the results accordingly. Most people using SRST2 are doing so because they want to identify acquired resistance genes/alleles, for which we recommend our pre-formatted database ARGannot.r1.fasta, which is based on the ARG Annot database with some additions from ResFinder and CARD (but only the acquired genes).
I have been trying to run the resistance script using the CARD database, but it always produces an empty results file. I obtained the CARD.fa from https://card.mcmaster.ca/download/0/broadstreet-v1.1.0.tar.gz. Should I run this (once extracted) through the steps outlined for clustering etc.? I also noticed that the preliminary resistance data was stated to be based on the ResFinder database and CARD, so if I have run SRST2 using the resfinder.fasta, would that have included the CARD resistance data set as well? Thank you for your time.