Open aroon-color opened 1 day ago
Hi @aroon-color,
Thanks for sending over the example, it makes it much easier to investigate. I agree it looks like the support is very far off. I will need to run some tests to get to the bottom of this, but my guess is that it has something to do with the very high level of soft-clipping in the reads. Perhaps there are primers in the reads that need to be trimmed? It looks like almost all the reads in the area are soft-clipped. There are also lots of inversion-pattern reads (green colour below) - is this to be expected?
We do expect a slightly higher than usual number of reads to look like they have inversion signals, but they're really just R1/R2 being assigned incorrectly during demux. The soft-clips are generally just PolyA-like sequences that result from the enrichment library prep, but I can also take a closer look at just these reads.
I see. I will look into it as soon as possible.
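To put a number on "almost all the reads in the area are soft-clipped", the fraction of soft-clipped reads can be estimated directly from the CIGAR strings. The following is a minimal sketch in plain Python, not part of Dysgu: the function names, the `min_clip` threshold of 10 bases, and the example CIGAR strings are all illustrative assumptions, and in practice the CIGARs would come from the BAM subset.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clipped_bases(cigar: str) -> int:
    """Return the total number of soft-clipped (S) bases in a CIGAR string."""
    return sum(int(length) for length, op in CIGAR_OP.findall(cigar) if op == "S")


def fraction_soft_clipped(cigars, min_clip=10):
    """Return the fraction of reads with at least `min_clip` soft-clipped bases.

    `min_clip` is an arbitrary illustrative threshold, not a Dysgu setting.
    """
    if not cigars:
        return 0.0
    clipped = sum(1 for cigar in cigars if soft_clipped_bases(cigar) >= min_clip)
    return clipped / len(cigars)


# Hypothetical CIGAR strings, for illustration only.
example_cigars = ["100M", "30S70M", "60M40S", "5S95M"]
print(fraction_soft_clipped(example_cigars))
```

A high fraction here would be consistent with the PolyA-like tails described above, though it would not by itself explain the inflated support counts.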
Hi @kcleal,
I'm seeing some weird behavior from Dysgu around read counting and was curious if you could help. I'll preface this by saying the sequencing data I'm working with is not WGS/WES but instead target enrichment data, which might be a confounding variable. Not entirely sure how Dysgu's modeling deals with more peaky data.
Attached is a BAM subset to the region I'm curious about in MUTYH and the corresponding Dysgu VCF. Dysgu command:
`.venv/bin/dysgu run b37/GRCh37.p12.fa temp sample_roi.bam --max-cov -1 -p 6 --min-support 3 --svs-out sample_roi.vcf -x --min-size 50 --metrics`
Note that this BAM was created by taking the alignment file from `dysgu fetch` and subsetting it to `1:45790000-45800000`.
sample_roi_bam_vcf.zip
I'm seeing Dysgu call a 67bp deletion starting at `chr1:45797418` (`\t` -> `\n` to make reading easier):
The part that's confusing to me is the read metrics Dysgu says support the call. They are: `SU=1384;WR=1;PE=838;SR=0;SC=1199;BND=544`. But if I manually inspect that region in IGV, I see no evidence of the hundreds of SU, PE, SC, or BND alignments that would support this call. Average coverage across this region is ~110x. I see a similar pattern with short DEL (\~<80bp) and INV (\~<400bp) across several of the samples I've tested so far, where Dysgu seems to be producing a very confident call given the read support indicated, but inspection of the alignment doesn't back up these calls.
Happy to provide more details if that would help!
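For reference, the mismatch between the reported support and the coverage can be made explicit with a few lines of Python. This is only a rough sanity check, assuming the support fields are plain integer counts and using the approximate ~110x coverage quoted above; `parse_info` is a hypothetical helper, not a Dysgu function.

```python
def parse_info(info: str) -> dict:
    """Split a semicolon-separated KEY=VALUE string into a dict of integer counts."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = int(value)
    return fields


# Support metrics copied from the call reported above.
info = parse_info("SU=1384;WR=1;PE=838;SR=0;SC=1199;BND=544")

# Approximate average coverage quoted in the report (~110x).
approx_coverage = 110

ratio = info["SU"] / approx_coverage
print(f"SU is {ratio:.1f}x the average coverage")
```

A reported support well above the local read depth suggests that reads are being counted more than once or drawn from outside the region, rather than reflecting the alignments visible in IGV.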
@kkchau