katholt / srst2

Short Read Sequence Typing for Bacterial Pathogens

Discrepancy between top scoring allele in result file and consensus sequence file #39

Closed alex-m-a closed 8 years ago

alex-m-a commented 9 years ago

There is a discrepancy between the top-scoring allele reported in the result file and the allele written to the consensus sequence file (using --report_all_consensus) when the "mlst_allele/trun" warning appears in the mismatches field.

I believe this is caused by the arguments passed to the create_allele_pileup function (specifically top_allele), which do not take into account the next_best_allele that is used for reporting when a truncation override is detected.
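
To illustrate the suspected mismatch, here is a minimal sketch. It is not srst2's actual code: the function names `choose_reported_allele` and `build_outputs`, the `fixed` switch, and the allele names are hypothetical, and only `top_allele` and `next_best_allele` come from the description above. It assumes the result file uses the runner-up allele when a truncation override occurs, while the consensus step still receives the original top allele.

```python
def choose_reported_allele(top_allele, next_best_allele, top_is_truncated):
    """Pick the allele to report (hypothetical helper).

    Assumption: on a truncation override the runner-up is reported instead.
    """
    if top_is_truncated and next_best_allele is not None:
        return next_best_allele
    return top_allele


def build_outputs(top_allele, next_best_allele, top_is_truncated, fixed=True):
    """Return which allele each output file would be based on (hypothetical)."""
    reported = choose_reported_allele(top_allele, next_best_allele, top_is_truncated)
    # Suspected problem: the consensus is always built from top_allele,
    # even when a different allele was reported. The fix is to pass the
    # reported allele to the consensus/pileup step as well.
    consensus_source = reported if fixed else top_allele
    return {"reported": reported, "consensus_source": consensus_source}


# Made-up allele names, with a truncation override on the top allele.
buggy = build_outputs("adk_1", "adk_7", top_is_truncated=True, fixed=False)
fixed = build_outputs("adk_1", "adk_7", top_is_truncated=True, fixed=True)
print(buggy)
print(fixed)
```

With `fixed=False` the two outputs refer to different alleles, which is the discrepancy described above; with `fixed=True` they agree.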

My apologies if this post is bad form, as I am absolutely brand new to git!

ramjet10 commented 9 years ago

I have found the same issue (or at least I think it is the same issue). I am using a custom MLST database, and when I use the --report_all_consensus feature, the allele sequences that are returned frequently contain deletions when compared with the database alleles. When I assemble the genome using SPAdes and then inspect the loci, there are no deletions. The deletions also often disrupt the reading frame of essential genes, so they are unlikely to be real. I am also new to this, so forgive me if I have not provided enough (or appropriate) info.

katholt commented 8 years ago

Thanks for pointing this out; this is now fixed.

ramjet10 commented 8 years ago

Thank you! Will check it out when exam marking is over.