katholt / srst2

Short Read Sequence Typing for Bacterial Pathogens

confusing or incorrect scoring leads to mis-called allele #141

Open mikeyweigand opened 10 months ago

mikeyweigand commented 10 months ago

I have one isolate of C. diphtheriae that produces an unexpected allele call at one locus. Below is a filtered subset of the scores file. It shows that allele dnaE_1 is matched with the highest coverage depth and no mismatches or indels, yet dnaE_35 receives the lowest score:

dnaE_35 0.1843841402268859 0.997 1.0 1.0 95.7627118644 354 2 0 15 1 0 1.0 1 1 0.010000000000000009
dnaE_34 0.23441588843224148 0.997 1.0 1.0 95.7627118644 354 3 0 15 1 0 1.0 1 1 0.010000000000000009
dnaE_36 0.23441588843224148 0.997 1.0 1.0 95.7627118644 354 3 0 15 1 0 1.0 1 1 0.010000000000000009
dnaE_39 0.23441588843224148 0.997 1.0 1.0 95.7627118644 354 3 0 15 1 0 1.0 1 1 0.010000000000000009
dnaE_1 0.28044602409771 66.068 44.5 37.5 100.0 354 0 0 0 NA 0.025974025974 0.025 1 40 0.15577799596720934
dnaE_33 0.281215297653878 0.997 1.0 1.0 95.7627118644 354 4 0 15 1 0 1.0 1 1 0.010000000000000009
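To make the discrepancy easier to see, here is a minimal Python sketch that ranks the same rows two ways: by score and by average depth. The column names (`score`, `avg_depth`, `mismatches`, and so on) are my assumption about what each field means and do not come from the scores file itself; only the first ten fields of each row are kept.

```python
from collections import namedtuple

# Field names are assumed for illustration; they are not taken from the issue.
Row = namedtuple(
    "Row",
    "allele score avg_depth edge1 edge2 pct_cov size mismatches indels truncated",
)

# First ten fields of each row in the scores excerpt.
ROWS = """\
dnaE_35 0.1843841402268859 0.997 1.0 1.0 95.7627118644 354 2 0 15
dnaE_34 0.23441588843224148 0.997 1.0 1.0 95.7627118644 354 3 0 15
dnaE_36 0.23441588843224148 0.997 1.0 1.0 95.7627118644 354 3 0 15
dnaE_39 0.23441588843224148 0.997 1.0 1.0 95.7627118644 354 3 0 15
dnaE_1 0.28044602409771 66.068 44.5 37.5 100.0 354 0 0 0
dnaE_33 0.281215297653878 0.997 1.0 1.0 95.7627118644 354 4 0 15
"""


def parse(text):
    """Turn whitespace-separated score rows into Row tuples."""
    rows = []
    for line in text.strip().splitlines():
        f = line.split()
        rows.append(
            Row(
                f[0],
                float(f[1]), float(f[2]), float(f[3]), float(f[4]), float(f[5]),
                int(f[6]), int(f[7]), int(f[8]), int(f[9]),
            )
        )
    return rows


rows = parse(ROWS)
lowest_score = min(rows, key=lambda r: r.score)
deepest = max(rows, key=lambda r: r.avg_depth)

print("lowest score:", lowest_score.allele, lowest_score.score)
print("highest depth:", deepest.allele, deepest.avg_depth, "mismatches:", deepest.mismatches)
```

With these rows, the lowest score belongs to dnaE_35 (depth about 1, 2 mismatches, 15 truncated bases), while the deepest allele is dnaE_1 (depth about 66, no mismatches, no truncation).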

The final MLST output includes: dnaE35*? dnaE_35/2snp15holes dnaE_35/edge1.0

Any ideas what might cause this (possible) error? I can't find any obvious clues in the logs, and it only seems to occur in this one sample. Testing different versions of bowtie2 produces the same results. Thanks.