broadinstitute / gatk

Official code repository for GATK versions 4 and up
https://software.broadinstitute.org/gatk

What is the ideal minimum base quality for counting reads with GATK ASEReadCounter? #6753

Closed Hemantcnaik closed 4 years ago

Hemantcnaik commented 4 years ago

Hello, I am using GATK ASEReadCounter. One of its options is --min-base-quality, which defaults to 0. Can anyone suggest an ideal value? I have Illumina Smart-seq2 reads.

ldgauthier commented 4 years ago

Hi @Hemantcnaik ,

This is a much better question for the GATK forum. We try to limit GitHub issues to bugs and feature requests, while questions about tool usage should go to the forum, where they're triaged by another team.

We haven't done much RNA development in a long time, though hopefully that's about to change. In the meantime, in a lot of other places in the GATK code we use 20 as a minimum base quality. If you're really concerned about optimizing this value, you could probably come up with an in silico experiment similar to what we do for BQSR. If you assume that every site in your sample that's in dbSNP is a real variant and everything that's not is a false positive (not strictly true, but a good approximation), then you can compare the base quality distributions for true positives and false positives and decide what's a good tradeoff between sensitivity and precision.
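A minimal sketch of that kind of in silico experiment, in Python. It assumes you have already extracted two lists of base qualities yourself: one for bases at sites found in dbSNP (treated as true positives) and one for bases at sites not in dbSNP (treated as false positives). The function name `tradeoff_by_threshold` and the example numbers are hypothetical and only for illustration; this is not GATK code.

```python
from typing import Iterable, List, Tuple


def tradeoff_by_threshold(
    known_site_quals: Iterable[int],
    novel_site_quals: Iterable[int],
    thresholds: Iterable[int],
) -> List[Tuple[int, float, float]]:
    """Return (threshold, sensitivity, precision) for each candidate cutoff.

    known_site_quals: base qualities at dbSNP sites (assumed true positives).
    novel_site_quals: base qualities at non-dbSNP sites (assumed false positives).
    """
    known = list(known_site_quals)
    novel = list(novel_site_quals)
    rows = []
    for t in thresholds:
        # A base is kept if its quality is at least the threshold.
        tp = sum(q >= t for q in known)
        fp = sum(q >= t for q in novel)
        # Sensitivity: fraction of assumed-true bases that survive the cutoff.
        sensitivity = tp / len(known) if known else 0.0
        # Precision: fraction of surviving bases that are assumed true.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        rows.append((t, sensitivity, precision))
    return rows


if __name__ == "__main__":
    # Made-up qualities, purely illustrative; substitute values from your own data.
    known = [30, 35, 38, 12, 25, 40, 33, 20]
    novel = [8, 10, 15, 22, 5, 31]
    for t, sens, prec in tradeoff_by_threshold(known, novel, [0, 10, 20, 30]):
        print(f"Q>={t}: sensitivity={sens:.2f} precision={prec:.2f}")
```

Raising the threshold generally trades sensitivity for precision, so scanning a handful of cutoffs around 20 on your own data should show where the curve flattens out.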

droazen commented 4 years ago

Closing -- moved to the forum