Closed brentp closed 4 years ago
Thank you Brent! We started an internal discussion about it. Will keep you posted.
I ran make_examples with dad.sam.gz and recorded output of the realigner (upper part is original BAM and lower part show realigned reads). You are right Brent, realinged reads are mapped to support 1 del and 1 ins. But in the end the haplotype would be the same with 1 DEL and 1 INS or 3 SNPs, it is just a different representation. This behavior depends on penalty scores that we use for Smith–Waterman alignment.
And child reads should have been aligned the same way as dad's. But we have a special algorithm that decides whether to run realignment. In the case of a child this algorithm decided against realignment. This algorithm can be turned off in which case reads will be realigned for all samples the same way. In order to do that make_examples needs to run with --nows_use_window_selector_model flag. Below is the example command I used to run make_example:
make_examples \ --ref GRCh38.genome.fa \ --reads kid_sorted.bam \ --examples /tmp/examples.tfrecord.gz \ --mode calling \ --regions chr8:75144950-75145050 \ --nows_use_window_selector_model
Thanks @akolesnikov !
@brentp if you're using run_deepvariant
to run, you should be able to use
--make_examples_extra_args "ws_use_window_selector_model=false"
hi, thanks for the replies. so --make_examples_extra_args "ws_use_window_selector_model=false"
(since I am using run_deepvariant
) will accomplish the same as what @akolesnikov suggests with --nows_use_window_selector_model
?
and I don't see much documentation on that option, is there any downside to that? (I assume it just affects run time?)
Yes, if you use run_deepvariant
(which is a wrapper for the 3 separate steps), the *_extra_args
flags are just a more flexible way to allow you specify flags for each step.
In run_deepvariant
, we hard-coded some of the commonly used flags but not all of them. For example, you can't directly specify --ws_use_window_selector_model
to run_deepvariant
.
We thought people might eventually have use cases for all other less known flags. (Which is why the *_extra_args
flags exist, but haven't been advertised or documented other than just the flag description.
make_examples_extra_args
description can be found here:
https://github.com/google/deepvariant/blob/97cd861800ccb43d750f392b518e99d514adddd8/scripts/run_deepvariant.py#L92-L96
Regarding the downside of turning off ws_use_window_selector_model
in make_examples -- yes, it'll be slower. It's a small model where we used to decide whether we need to realign a window or not. When we set it to default, we found that it significantly decreased runtime, with negligible trade-offs.
You can find the description of this feature when it's first released in v0.7: https://github.com/google/deepvariant/releases/tag/v0.7.0
"Changed window selector to use a linear decision model for choosing realignment candidates. This can be controlled by a flag. -ws_use_window_selector_model
which is now on by default."
@pichuan Is this option gonna increase accuracy and sensitivity in general?
@kokyriakidis to your question, having ws_use_window_selector_model was to improve runtime, not to improve accuracy or sensitivity. But empirically we expect the trade-off on accuracy to be small. Below you can see my comparison between turning it on/off for WGS and WES on v0.9. (On PACBIO setting, this doesn't affect anything because we don't run realigner for PACBIO.)
I ran some numbers based on the v0.9 WGS and WES case study data. I used this type of CPU machine for the runtime.
The following results were done in two settings:
BASE
: This is the default (i.e., ws_use_window_selector_model
is true)
EXPT
: Turn off window selector (i.e., set ws_use_window_selector_model
to false)
Settings | make_examples runtime |
Indel F1 | SNP F1 |
---|---|---|---|
BASE |
81m29.320s | 0.998112 | 0.999633 |
EXPT |
107m3.893s | 0.998156 | 0.999642 |
Settings | make_examples runtime |
Indel F1 | SNP F1 |
---|---|---|---|
BASE |
10m5.515s | 0.973295 | 0.999318 |
EXPT |
19m25.869s | 0.974056 | 0.999362 |
For the detailed hap.py output, you can see here.
Hi @pichuan ,
Thank you very much for this wonderful analysis! Turning it off MAY be worth the increased runtime in cases where someone is looking for rare SNPs in a population or in a clinical environment where you prioritize accuracy over runtime.
I re-ran DV+glnexus on a cohort of ~150 WGS trios. Using --make_examples_extra_args "ws_use_window_selector_model=false"
does reduce the apparent mendelian violation rate. Here is the plot of that before (left) and with that argument (right). This is after some sane filtering on GQ and rare in gnomad.
The right side is lower; good. But, then I further filtered that set of variants on your recently released thousand genomes DV VCF, requiring an allele frequency less than 1% there. Now, even though the RHS as dropped by ~20 DNs per trio, the "before" LHS, has a lower number of putative de novos and the end result is that it is worse to run with ws_use_window_selector_model=false
This must be because you ran the thousand genomes samples without turning off the window selector and I am annotating on exact REF and ALT allele matches, not using haplotype enumeration as does hap.py, but AFAIK, this is how all (allele frequency) annotation software will work.
Any recommendations on how to proceed? I did bcftools norm on my set and on the deep variant thousand G calls and I suspect something like vcfallelelicprimitives will not help here either. A thousand G callset run without the window selector would help, but is only a band-aid. Is there an annotation software that does haplotype matching? Presumably this will be more of a problem as the GATK juggernaut fades.
Hi @brentp
This is a very good question, but I am not sure I have a good answer for you. For you and @kokyriakidis the fact that window selector can cause inconsistency across a pedigree was not something we previously appreciated when considering the trade-offs between speed and accuracy. In the intermediate future, we hope to profile performance of make_examples to improve speed and at that time we will revisit the decision to enable this by default. This will not occur for the upcoming release, but possibly the following one.
It is possible to create a BED file covering de novo sites and force window selector=false across all of them, this should be fast and it could be an exercise we would want to do on the 1KG pVCF.
We are also working on some more involved methods for calling in a trio, but that is also in the intermediate timescale.
Hi Andrew, thanks for the reply. I hadn't previously appreciated the extent of this problem before either--not just in DV, but everywhere. The "kid" and "dad" variants I posted above can be made to be the same variant-set if we know they are on the same haplotype--and running with the window selection off does this, but it still creates a problem with annotating across different call-sets when different representations are used. gnomad prefers the representation of an insertion and a deletion rather than the 3 individual SNPs.
So, even if this is further resolved in DV (running with ws_use_window_selector_model=false
is sufficient for me), then it will be, in cases like this, impossible to correctly annotate across cohorts without phasing information.
edit: feel free to close this issue as my original issue is addressed.
I can open a separate issue if it's helpful, but just a couple more things related to this...
First, while the haplotype stuff like in the images above is mostly gone with ws_use_window_selector_model=false
, I still see the problem in some false positive calls.
Another thing that happens with things that DV calls de novos but obviously are not is that the kid will just meet some threshold and have a number of MQ ~40 reads with the de novo, where as the parent will have a number of reads with the allele that are MQ ~18 or lower. But the VCF reports AD[1] == 0 for many of these in the parent. If the count of low-quality alleles were reported in the sample fields in the VCF, it would be simpler to filter to make sure the allele was absent from the parent.
Hi @brentp @kokyriakidis ,
We (genomics team in Google Health) have released DeepVariant 0.10 on 3/26, which includes improving consistency and accuracy by turning off ws_use_window_selector_model
by default, along with the following other changes:
For more information, please read our release notes: https://github.com/google/deepvariant/releases/tag/v0.10.0 If you have any feedback about your experience with v0.10, please let us know.
Thank you!
hi, I am using DV 0.9 to find de novo variants in trios and so I am enriching for weird stuff. I am sure your team is aware of some/all of these, but I'll document at least 1 case here for the record.
Nearly all obvious false positive de novo calls follow the same pattern. Mostly there is a haplotype from mom , and a haplotype from dad that are different. The kid inherits both the variable haplotypes, but the combination of DV and glnexus gentoype such that the variant appears as de novo.
But, here is an example where there is just an incorrect call from DV that leads to 3 neighboring spurious de novo calls in the kid (top row):
note that dad in the 3rd row has a single read with a 1-base deletion (dash) followed by an insertion (purple tick). I can't show all of the reads in this image, but I have scrolled through and verified that is the only read.
here is the content of the dad's VCF for that region (the mom's is actually very similar):
note that that is the single-base del and the insertion that occurs in only 1 read. Since the mom's is the same, maybe there's something akin to realignment going on, but by contrast, here is the kid's (seemingly more sensible) VCF for that region:
here is the content of the gvcf for dad:
and kid:
I am attaching a small sam for kid and dad aligned to hg38 kid.sam.gz dad.sam.gz
I have other scenarios, but this one is one that seems clearly a deep variant issue and not a problem with glnexus.