Open ccr25 opened 2 years ago
When you say a few thousand reads, out of how many? If it's 1,000 out of 1 million, for example, that's a 0.1% error rate, which is about as good as you can expect.
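As a rough sketch of the arithmetic above (the function name and the read counts are illustrative only, not part of UNCALLED or Kraken), the rate is simply the number of wrongly kept reads divided by the total number of reads sequenced:

```python
def misclassification_rate(off_target_reads: int, total_reads: int) -> float:
    """Return the fraction of all reads that were wrongly kept as on-target.

    Both arguments are hypothetical counts you would take from your own run,
    e.g. reads UNCALLED kept that Kraken does not assign to the target genome.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return off_target_reads / total_reads


# Example figures from the reply: 1,000 reads out of 1 million.
rate = misclassification_rate(1_000, 1_000_000)
print(f"{rate:.1%}")  # prints 0.1%
```

Plugging in your own counts should tell you whether a few thousand reads is a large or small fraction of the run.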
Are you enriching for a single bacterial genome, or multiple? Repeats and low-complexity sequences are the main cause of misclassification for UNCALLED. Besides that, reducing the number of chunks UNCALLED attempts to map is the main way to improve precision, at the cost of sensitivity of course. Hope that helps!
Thanks, Sam
Hi,
Is it possible to increase the stringency of the chunk mapping to the reference for enrichment? We are getting a few thousand reads that UNCALLED maps to our reference, but when we run the data through Kraken, very few of those reads are accurately assigned to the bacterial genome we are enriching for. I realize we need to allow for errors, but can we adjust the number of mismatches allowed in a chunk? Thanks,
Chandler