fritzsedlazeck / Sniffles

Structural variation caller using third generation sequencing

Dramatic change in GTs with smaller size DEL SV #530

Open tuannguyen8390 opened 3 days ago

tuannguyen8390 commented 3 days ago

Hi @fritzsedlazeck @lfpaulin @hermannromanek

We've been comparing results from the new v2.5.2 with older genotypes (v2.2 for snf, then v2.3.3 for joint calling/merging with pct-seq=0) on a set of 108 ONT cattle sequences (~10X–25X).

As expected, with v2.5.2 we found fewer large DEL with 1/1 genotypes, and for several known recessive lethals the genotypes were corrected to 0/1, as reported in a different issue thread.

However, with v2.5.2 we also found that a substantial share of genotypes for smaller DEL (<5 kb) changed from 1/1 to 0/1. For example, for DEL <5 kb, 15% of the GTs changed between 2.3.3 and 2.5.2; for DEL of 5-10 kb it is 19%. The figure was <1% for INS.
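For anyone wanting to reproduce this kind of tally, here is a minimal sketch of one way to count genotype changes per size bin. It is not the script used for the numbers above. It assumes the SVs have already been matched between the two call sets and loaded into plain dictionaries; the function names, the size bins, and the input layout are all hypothetical.

```python
from collections import defaultdict

# Hypothetical size bins in bp (lower bound inclusive, upper bound exclusive).
SIZE_BINS = [
    (0, 5_000, "<5kb"),
    (5_000, 10_000, "5-10kb"),
    (10_000, float("inf"), ">=10kb"),
]


def size_bin(svlen):
    """Return the bin label for an SV length (DEL lengths may be negative)."""
    length = abs(svlen)
    for lower, upper, label in SIZE_BINS:
        if lower <= length < upper:
            return label
    return None


def normalise_gt(gt):
    """Treat 0/1, 1/0 and phased 0|1 as the same genotype."""
    return "/".join(sorted(gt.replace("|", "/").split("/")))


def gt_change_rate(old_calls, new_calls, svlen_by_sv):
    """Fraction of comparable genotypes that differ, per size bin.

    old_calls / new_calls: {(sv_key, sample): GT string}
    svlen_by_sv:           {sv_key: SVLEN}
    Genotypes that are missing in either call set are skipped.
    """
    changed = defaultdict(int)
    total = defaultdict(int)
    for key, old_gt in old_calls.items():
        new_gt = new_calls.get(key)
        if new_gt is None or "." in old_gt or "." in new_gt:
            continue
        label = size_bin(svlen_by_sv[key[0]])
        total[label] += 1
        if normalise_gt(old_gt) != normalise_gt(new_gt):
            changed[label] += 1
    return {label: changed[label] / total[label] for label in total}
```

The hard part in practice is deciding which SVs in the two releases are "the same" call, since IDs and breakpoints can shift between versions; that matching step is deliberately left out of this sketch.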

We also checked the BAM visualisation for one SV; the 1st and 5th individuals were:

Now called as 0/1:60:14:12:Sniffles2.DEL.20E2SA in 2.5.2 (was 1/1 in 2.3.3)

Now called as 0/1:60:23:23:Sniffles2.DEL.1969SA in 2.5.2 (was 1/1 in 2.3.3)

image
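To make the two sample fields above easier to read, here is a small sketch that splits them into named values and computes the fraction of variant-supporting reads. It assumes the FORMAT string is `GT:GQ:DR:DV:ID` (reference reads in DR, variant reads in DV), which is how a multi-sample Sniffles2 VCF usually lays these out; check the FORMAT column of your own VCF. `parse_sample_field` is a hypothetical helper, not part of Sniffles.

```python
def parse_sample_field(field, fmt="GT:GQ:DR:DV:ID"):
    """Split a VCF sample field into a dict and add the variant read fraction.

    The default FORMAT order is an assumption; pass the real FORMAT string
    from the VCF record if it differs.
    """
    values = dict(zip(fmt.split(":"), field.split(":")))
    ref_reads = int(values["DR"])
    var_reads = int(values["DV"])
    depth = ref_reads + var_reads
    # Fraction of reads supporting the variant; None when there is no coverage.
    values["VAF"] = var_reads / depth if depth else None
    return values


if __name__ == "__main__":
    calls = (
        "0/1:60:14:12:Sniffles2.DEL.20E2SA",
        "0/1:60:23:23:Sniffles2.DEL.1969SA",
    )
    for call in calls:
        parsed = parse_sample_field(call)
        print(parsed["GT"], parsed["DR"], parsed["DV"], round(parsed["VAF"], 2))
```

Under that reading of the fields, the two calls have 12 of 26 and 23 of 46 reads supporting the deletion, so whether the reference-supporting reads are genuine is the thing to check in the BAM.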

So far we have only looked at a few examples, but we will look at more tomorrow.

Hardy-Weinberg Equilibrium (HWE) analysis indicates that version 2.5.2 also produced a large increase in the proportion of smaller DEL deviating from HWE, with an excess of heterozygous calls compared to 2.3.3.

image
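For reference, a per-site HWE check of the kind described above can be sketched as follows. This is a plain chi-square test with one degree of freedom on the three genotype counts, not necessarily the test behind the plot; with only ~100 animals an exact test may be preferable for rare alleles. The function name and return layout are hypothetical.

```python
import math


def hwe_chi_square(n_hom_ref, n_het, n_hom_alt):
    """Chi-square HWE test (1 df) for a biallelic site.

    Returns (chi2, p_value, het_excess), where het_excess is the observed
    minus the expected number of heterozygotes (positive = excess of 0/1).
    """
    n = n_hom_ref + n_het + n_hom_alt
    if n == 0:
        raise ValueError("no genotyped samples")
    p = (2 * n_hom_ref + n_het) / (2 * n)  # reference allele frequency
    q = 1.0 - p                            # alternate allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_hom_ref, n_het, n_hom_alt)
    # Classes with zero expectation (monomorphic sites) contribute nothing.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, n_het - expected[1]
```

Running this per DEL on the genotype counts from each release, then comparing the sign of `het_excess` among significant sites, is one way to quantify the shift towards heterozygous calls.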

Cheers,

Tuan

tuannguyen8390 commented 3 days ago

Oh, I forgot to mention: we did compare the settings between 2.3.3 and 2.5.2 and noticed that two options are no longer enabled by default, --mapq 20 and --min-alignment-length 1000. We tried adding these two back and ran the above test.