Closed (jts closed this 9 years ago)
BLOSUM was derived from conserved regions of aligned protein sequences, not from sequencing data, so neither the nucleotide nor the amino acid scoring scheme is a good fit for nanopore data. I tested this and found that the (default) amino acid scheme performs better (in terms of correction accuracy) than the nucleotide scheme.
I'm closing this for now but training a new scoring scheme specific to basecalled nanopore data is something we should do in the future.
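For anyone picking this up later, here is a rough sketch of what "training" a nanopore-specific scheme could look like. It is only an assumption about one possible approach (a BLOSUM-style log-odds estimate), not something implemented in this repo. The function name `log_odds_matrix`, the `pseudocount` parameter, and the `(reference_base, called_base)` input format are all hypothetical, and gap penalties would still need to be handled separately.

```python
import math
from collections import Counter

BASES = "ACGT"

def log_odds_matrix(aligned_pairs, pseudocount=1.0):
    """Estimate log2-odds substitution scores from aligned base pairs.

    aligned_pairs: iterable of (reference_base, called_base) tuples taken
    from read-to-reference alignments (hypothetical input format).
    """
    pair_counts = Counter()
    for ref, called in aligned_pairs:
        # Skip gaps and ambiguity codes; only substitutions are modelled here.
        if ref in BASES and called in BASES:
            pair_counts[(ref, called)] += 1

    # Add a pseudocount to every cell so unseen pairs get a finite score.
    total = sum(pair_counts.values()) + pseudocount * len(BASES) ** 2
    joint = {
        (a, b): (pair_counts[(a, b)] + pseudocount) / total
        for a in BASES
        for b in BASES
    }

    # Marginal frequencies of reference bases and of called bases.
    ref_freq = {a: sum(joint[(a, b)] for b in BASES) for a in BASES}
    called_freq = {b: sum(joint[(a, b)] for a in BASES) for b in BASES}

    # Score = log2(observed pair frequency / frequency expected by chance).
    return {
        (a, b): math.log2(joint[(a, b)] / (ref_freq[a] * called_freq[b]))
        for a in BASES
        for b in BASES
    }
```

The resulting scores would still have to be scaled, rounded to integers, and written out in whatever matrix format the aligner expects.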
Hah! That makes me sad and scared. It drives me absolutely bonkers when I find and fix an egregious bug and get worse performance. Wouldn't be the first time.
As per @sjackman on Twitter:
https://twitter.com/sjackman/status/568835294193000448