Closed tedyun closed 5 years ago
Sorry about the travis failure, I need to fix the way the build works with docker (#149)
Just pulled/pushed your branch to launch https://travis-ci.org/dnanexus-rnd/GLnexus/builds/521600348
Thanks for catching this! Indeed none of our tests covered a case where min_AQ1>0, min_AQ2=0.
What do you think about just ignoring the min_AQ2 threshold if it's set to zero || (cfg.min_AQ2 && al.second.topAQ.V[1] >= cfg.min_AQ2)
? This would differ from your solution only in cases where we have two or more allele carriers all with AQ identically zero (doesn't worry me much).
I'm slightly nervous about using copy_number
which is estimated in a different way since it depends on the genotype zygosity (1 or 2 copies in each carrier). We can contrive an example such as a single 1/1 genotype leads to copy_number=2
but the topAQ vector still has only one entry.
Hi Mike, thank you for your quick response! Ah I see - I didn't know about the complexity about copy_number
variable.
I'm not sure if I completely understand your suggestion though.. For example, I'm thinking about the case when min_AQ1=30 and min_AQ2=0. If there is an allele that has multiple copies and if its AQs are identically 20, wouldn't that be filtered out by that logic? (because al.second.topAQ.V[0]
will be less than 30 and the clause after ||
will always be false.) Maybe I'm misunderstanding what al.second.topAQ.V
vector keep track of.
Btw, my coworker had another one-liner solution which is changing the default values of the vector here to -1
instead of 0
here https://github.com/dnanexus-rnd/GLnexus/blob/e25a6be613fbd1879477d62c7258a2dcfe8dd896/include/types.h#L291 I felt nervous about changing the default because the vector is being used a lot, but please let me know if you think that'll work.
Yes, I was not thinking clearly with my last suggestion. I like the -1 idea. Let me noodle a little bit if there may be unintended consequences.
Yep thank you. I updated my branch with the -1 solution.
Continues on #167. Thanks!
Totals | |
---|---|
Change from base Build 1953: | 6.5% |
Covered Lines: | 80 |
Relevant Lines: | 80 |
My name is Ted Yun and I'm in Genomics team at Google. My coworkers and I have found that when GLnexus is run with the two different set of parameters:
min_AQ1 = 30, min_AQ2 = 0
andmin_AQ1 = 0, min_AQ2 = 0
, it produces identical results.I believe the issue is coming from this part of the code where
check_AQ
always returns true whencfg.min_AQ2 = 0
(regardless of thecfg.min_AQ1
value), becauseal.second.topAQ.V[1]
value is initialized to zero. The suggested changes here check the second inequality only if the copy number of the allele is at least two, which I believe is the definition of the min_AQ2 config.Please take a look. Thank you!
Regards, Ted