Closed KickiLagerstedt closed 1 year ago
@helena.malmgren has seen the same regarding F0014934-2, ALMS1-gene
For those 2 cases you posted above the variant is shown in the squished view:
I'm not sure why, could be due to downsampling?
Hi, for starters, the allele counts in Scout match what is in the input VCF. We also see traces of the called variants in the IGV.js view, but nowhere near the depth or evenness suggested by the AD and DP values. The low variant quality, on the other hand, does fit what we see.
I do not see any downsampling indicator for these regions in IGV.js, but let's give it a quick look in desktop IGV before passing it on to MIP. FLG2, at least, is a rather repetitive region, so DeepVariant may be doing something creative there by comparing different regions in one go, but it is not something I was aware of.
No, I agree, there are not enough reads for downsampling. I don't understand.
I can check in desktop IGV. But there are known repeats here, and the alignments look rather messy, with low-quality variants, some SA- and XA-tagged reads, etc. So I suspect that DeepVariant, through local realignment, has brought in more reads than bwa originally placed at this locus (the bwa placement is what we see in the CRAM file and visualise with IGV).
Compare perhaps e.g. this issue: https://github.com/google/deepvariant/issues/618, and the FAQ, https://github.com/google/deepvariant/blob/r1.5/docs/FAQ.md, especially the diff between bwa aln and realignment. Realignment in general improves the DeepVariant results, but can get a bit confusing when visualising the initial alignments - much as for TIDDIT v3 and onwards.
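A quick way to gauge how ambiguous the original bwa placement is would be to count how many reads over the region carry SA or XA tags. The following is only a minimal sketch: it assumes the reads have been exported as SAM text lines, the helper names are made up for illustration, and the three records are hypothetical, not taken from the case above.

```python
def has_alt_alignment_tag(sam_line):
    """True if a SAM record carries an SA (supplementary) or XA (alternative hits) tag."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags start at column 12 and look like TAG:TYPE:VALUE.
    tags = {field.split(":", 1)[0] for field in fields[11:]}
    return bool(tags & {"SA", "XA"})


def fraction_multi_placed(sam_lines):
    """Fraction of alignment records (header lines skipped) with an SA or XA tag."""
    records = [line for line in sam_lines if line.strip() and not line.startswith("@")]
    if not records:
        return 0.0
    return sum(has_alt_alignment_tag(line) for line in records) / len(records)


# Hypothetical records for illustration only.
example_reads = [
    "r1\t0\tchr1\t1000\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNM:i:0",
    "r2\t0\tchr1\t1002\t0\t10M\t*\t0\t0\tGTACGTACGT\tIIIIIIIIII\tNM:i:1\tXA:Z:chr1,+5000,10M,1;",
    "r3\t0\tchr1\t1004\t20\t5S5M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tSA:Z:chr1,7000,+,5M5S,20,0;",
]
print(fraction_multi_placed(example_reads))
```

A high fraction would suggest that many reads have competing placements elsewhere, which is the situation where realignment can change the read counts at a locus.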
It does look very similar on the desktop version:
Let's pass this on to MIP and e.g. @ramprasadn or @jemten, to see if they can find out whether the region is indeed being realigned, and ideally provide a post-realignment image of the region from a recent run? I think that would be rather illustrative. And of course, just check that it is actually a feature of DeepVariant and not a bug! 😊
This is how DeepVariant sees this region (https://scout.scilifelab.se/cust002/F0046529-3/23d2fc25374c8c4ee1d51012dd1ca3c7#comments) after local realignment.
It seems rather clear that neither representation is quite true to the underlying molecular one, but at least here the counts are very similar, and we can perhaps interpret this variant as being present in two out of four regions that are pretty similar to this one.
The quality value is low for a reason.
Yeah, GQ=9 is a clear indicator that this is a messy region. How do you wanna proceed with this? I don't think it's very feasible to save the local realignment files for display in Scout.
Agreed!
I'm personally good with this explanation. The counts match the realigned variants, not the original bwa alignment. It's not a region where I would feel very confident about short read calls with respect to which repeat copy has which variant, and the quality reflects this. What do you say @KickiLagerstedt?
I definitely think we want realignment with DeepVariant, so turning it off just to simplify the view of some actually very difficult regions is hardly on the table. Is there some kind of indicator in the VCF from DeepVariant showing that it used a realignment to make this particular call, beyond the low GQ?
I can't see any indication of that in the VCF. I don't think there is much we can do about this on the MIP side.
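Since the VCF has no explicit realignment marker, one pragmatic option is to flag calls where many ALT reads coincide with a low GQ, which is the pattern seen in this case. Below is a minimal sketch under stated assumptions: the function names and thresholds are hypothetical, it assumes a single-sample VCF with GQ and AD in the FORMAT column, and the example record is made up (only GQ=9 echoes the thread).

```python
def parse_sample_fields(vcf_line):
    """Map the FORMAT keys to the first sample's values for one VCF record."""
    cols = vcf_line.rstrip("\n").split("\t")
    keys = cols[8].split(":")
    values = cols[9].split(":")
    return dict(zip(keys, values))


def flag_suspect_call(vcf_line, max_gq=20, min_alt_reads=10):
    """True when a call has many ALT-supporting reads but a low genotype quality.

    The thresholds are illustrative assumptions, not recommended values.
    """
    sample = parse_sample_fields(vcf_line)
    gq = int(sample["GQ"])
    # AD lists the REF depth first, followed by one depth per ALT allele.
    alt_reads = sum(int(depth) for depth in sample["AD"].split(",")[1:])
    return gq <= max_gq and alt_reads >= min_alt_reads


# Hypothetical record: position, alleles and depths are made up.
example = "\t".join([
    "chr1", "1000", ".", "A", "G", "9.5", "PASS", ".",
    "GT:GQ:DP:AD", "0/1:9:80:42,38",
])
print(flag_suspect_call(example))
```

Such a flag would not prove that realignment was involved; it would only mark calls where the counts and the quality disagree enough to warrant a closer look.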
OK - Helena and I accept this!
Ok closing this - thank you @jemten!
Variants are called, but with a low score, and the many reads per allele are not shown in IGV
https://scout.scilifelab.se/cust002/F0046529-3/23d2fc25374c8c4ee1d51012dd1ca3c7#comments
https://scout.scilifelab.se/cust002/F0046529-3/3746556ee54a7ab08acac68520632b3d