phac-nml / biohansel

Rapidly subtype microbial genomes using single-nucleotide variant (SNV) subtyping schemes
Apache License 2.0

Fix #98 QC check for mixed subtypes #100

Closed by peterk87 5 years ago

peterk87 commented 5 years ago

Hi @glabbe

With the changes in this pull request, I am able to get the proper QC messages using the scheme you linked to in #98.


For ERR163996, I now get a QC FAIL with the message:

FAIL: Mixed subtype; the positive and negative kmers were found for the same target site 62657 for subtype "4.1.2".

Command-line:

$ hansel -s tb_speciation_scheme_v1.0.5.fasta -p ERR163996_* -O output.tsv --min-kmer-freq 8

For ERR182041, I now get a QC FAIL with the message:

FAIL: Mixed subtype; the positive and negative kmers were found for the same target site 4260268 for subtype "4.6.1.1".

Command-line:

$ hansel -v -s tb_speciation_scheme_v1.0.5.fasta -p ERR182041_*.fastq.gz --min-kmer-freq 8 --force
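For anyone following along, the idea behind the two QC messages above can be sketched roughly as below. This is not the code in this pull request or biohansel's actual implementation; `KmerHit`, `find_mixed_sites` and `qc_message` are hypothetical names, the record layout is an assumption, and the frequencies in `example_hits` are made up for illustration. The default threshold of 8 only mirrors the `--min-kmer-freq 8` value used in the commands above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple


class KmerHit(NamedTuple):
    """One observed k-mer (hypothetical record, not biohansel's data model)."""
    position: int      # target site position in the reference
    subtype: str       # subtype the k-mer is assigned to in the scheme
    is_positive: bool  # True for a positive k-mer, False for a negative one
    freq: int          # observed k-mer frequency in the reads


def find_mixed_sites(hits: Iterable[KmerHit], min_freq: int = 8) -> List[Tuple[int, str]]:
    """Return (position, subtype) pairs where both the positive and the
    negative k-mer were observed at or above min_freq."""
    seen: Dict[Tuple[int, str], Set[bool]] = defaultdict(set)
    for hit in hits:
        if hit.freq >= min_freq:  # ignore low-frequency k-mers
            seen[(hit.position, hit.subtype)].add(hit.is_positive)
    # A site is "mixed" when both polarities survived the frequency filter.
    return sorted(key for key, polarity in seen.items() if polarity == {True, False})


def qc_message(position: int, subtype: str) -> str:
    """Format a message in the style of the QC output quoted in this thread."""
    return (
        "FAIL: Mixed subtype; the positive and negative kmers were found "
        f'for the same target site {position} for subtype "{subtype}".'
    )


# Illustrative input only: positions and subtypes are taken from the messages
# above, but the frequencies are invented for this example.
example_hits = [
    KmerHit(62657, "4.1.2", True, 20),
    KmerHit(62657, "4.1.2", False, 15),
    KmerHit(4260268, "4.6.1.1", True, 30),
    KmerHit(4260268, "4.6.1.1", False, 3),  # below min_freq, so filtered out
]

mixed = find_mixed_sites(example_hits)
messages = [qc_message(pos, subtype) for pos, subtype in mixed]
```

With these invented frequencies only site 62657 is reported, because the negative k-mer at 4260268 falls below the frequency threshold. That is one reason the threshold matters for this check: set too low, sequencing errors alone could trigger a mixed-subtype failure.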

I'll continue testing the other genomes next week!

glabbe commented 5 years ago

Great! Thanks a lot @peterk87