phac-nml / staramr

Scans genome contigs against the ResFinder, PlasmidFinder, and PointFinder databases.
Apache License 2.0

Feature/blast tabular #14

Closed apetkau closed 6 years ago

apetkau commented 6 years ago

This switches BLAST over to the tabular output format. It's meant to be merged before https://github.com/phac-nml/staramr/pull/12 (which I will update to invert the direction of BLAST). Implements issue #10.
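For readers unfamiliar with the tabular format, here is a minimal sketch of reading it, assuming the default 12-column layout of BLAST+ `-outfmt 6`. The function name `parse_blast_tabular` and the example row are made up for illustration and are not taken from this PR, which may request different columns.

```python
import csv
import io

# Default column order for BLAST+ `-outfmt 6` (tabular) output.
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
INT_COLUMNS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
FLOAT_COLUMNS = ("pident", "evalue", "bitscore")


def parse_blast_tabular(handle):
    """Hypothetical helper: read tab-separated BLAST hits into a list of dicts."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines (present with `-outfmt 7`).
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(BLAST_COLUMNS, row))
        for name in INT_COLUMNS:
            hit[name] = int(hit[name])
        for name in FLOAT_COLUMNS:
            hit[name] = float(hit[name])
        hits.append(hit)
    return hits


# Invented example row, for illustration only.
EXAMPLE = "contig1\tgeneA\t99.50\t861\t4\t0\t1200\t2060\t1\t861\t0.0\t1560\n"
hits = parse_blast_tabular(io.StringIO(EXAMPLE))
print(hits[0]["sseqid"], hits[0]["pident"], hits[0]["length"])
```

Because each hit is one flat row, the same data could also be loaded straight into a pandas DataFrame using the same column names.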

I also renamed some of the getter/setter methods in AMRHitHSP and other classes from (database/query) to (amr_gene/genome). I did this so that, when inverting BLAST, I can just switch which BLAST fields amr_gene/genome refer to in this class, without having to worry about names no longer making sense.
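The idea behind the rename can be sketched roughly as follows. This is an illustration only: `HitSketch`, the two mapping dicts, and the accessor names are hypothetical and do not reproduce the actual AMRHitHSP code. It assumes the tabular fields `qseqid`, `sseqid`, `qstart`, and `sstart`.

```python
# Hypothetical field maps: which BLAST column each role-based name reads from.
GENOME_AS_QUERY = {
    "amr_gene_id": "sseqid", "amr_gene_start": "sstart",
    "genome_id": "qseqid", "genome_start": "qstart",
}
# Inverted direction: the AMR gene is the query, the genome is the database.
AMR_GENE_AS_QUERY = {
    "amr_gene_id": "qseqid", "amr_gene_start": "qstart",
    "genome_id": "sseqid", "genome_start": "sstart",
}


class HitSketch:
    """Hypothetical stand-in for a hit class; not the real AMRHitHSP."""

    def __init__(self, record, field_map):
        self._record = record        # one parsed tabular BLAST row (a dict)
        self._field_map = field_map  # role name -> BLAST column name

    def get_amr_gene_id(self):
        return self._record[self._field_map["amr_gene_id"]]

    def get_genome_id(self):
        return self._record[self._field_map["genome_id"]]

    def get_amr_gene_start(self):
        return self._record[self._field_map["amr_gene_start"]]

    def get_genome_start(self):
        return self._record[self._field_map["genome_start"]]


# Invented records describing the same hit in both BLAST directions.
forward = {"qseqid": "contig1", "sseqid": "geneA", "qstart": 1200, "sstart": 1}
inverted = {"qseqid": "geneA", "sseqid": "contig1", "qstart": 1, "sstart": 1200}

hit_forward = HitSketch(forward, GENOME_AS_QUERY)
hit_inverted = HitSketch(inverted, AMR_GENE_AS_QUERY)
print(hit_forward.get_amr_gene_id(), hit_inverted.get_amr_gene_id())
```

Callers only ever ask for the AMR gene or the genome, so inverting BLAST changes the mapping in one place rather than every call site.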

I realize there's probably more I could do to simplify this (e.g., removing many of the AMRHitHSP classes and trying to do everything in a pandas DataFrame). However, there's a lot of logic in those classes to group BLAST hits together based on coordinates, look up/parse out the specific codons/amino acids affected by a SNP, etc., and I'm not sure how easy it would be to convert to using purely a data frame.