fritzsedlazeck / Sniffles

Structural variation caller using third generation sequencing

Large deletion filtered out #479

Open sumudu-rangika opened 4 months ago

sumudu-rangika commented 4 months ago

Hi,

Thank you for a great tool!

My question concerns one of my samples, which has two deletions near each other. One is ~2847 bp and the other is ~12150 bp (a deletion of one copy of the CYP2D6 gene). Both deletions are clearly visible in IGV (attached). The small ~2847 bp deletion is called as a PASS SV with GT=1|0, GQ=47, DR=19 and DV=10. The large deletion close by is called as a PRECISE SV but is filtered with GT. This variant has GT=0|0, GQ=26, DV=4 and DR=24. However, in the IGV image there are 11 reads that support this large deletion. Why does Sniffles count only 4? I believe this is why it gets filtered with GT, as the GQ is high (screenshot attached). The small and large deletions are on the same reads, yet the small one is PASS and the large one is filtered.
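For reference, the variant read fraction implied by the reported counts can be worked out directly. This is only a sketch: it assumes DR and DV are the reference-supporting and variant-supporting read counts from the sample column, the helper name `variant_fraction` is made up for illustration, and Sniffles' own genotyping may weigh these counts differently.

```python
def variant_fraction(dr: int, dv: int) -> float:
    """Fraction of counted reads that support the variant: DV / (DR + DV)."""
    total = dr + dv
    if total == 0:
        raise ValueError("no reads counted at this site")
    return dv / total


# Counts reported in this issue (assumed to be per-sample DR/DV values).
small_del = variant_fraction(dr=19, dv=10)  # ~2847 bp deletion, FILTER=PASS
large_del = variant_fraction(dr=24, dv=4)   # ~12150 bp deletion, FILTER=GT

print(f"small deletion: {small_del:.2f}")  # small deletion: 0.34
print(f"large deletion: {large_del:.2f}")  # large deletion: 0.14
```

With only 4 of 28 counted reads supporting the large deletion, a homozygous-reference call is at least plausible; the open question in this issue is why the other supporting reads seen in IGV were not counted.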

Also, in another sample there is a PASS variant called with a ./. genotype.

This is ONT WGS data. Please help me to understand this.

Thank you. Best, Sumudu

deletion.pdf Screenshot from 2024-05-10 16-21-34

lfpaulin commented 3 months ago

Hello Sumudu,

It looks like Sniffles is only using the reads marked in red for that call. Can you confirm this against the read names listed in the INFO field under RNAMES? It seems that the other reads were not used, likely based on position. We would like to investigate this issue further. Could you please run the sample again with the --no-qc flag added and share the output SVs in that region? Columns 1, 2, 3, 7, 8 and 10 from the VCF file would be enough.

image

Best, Luis
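The extraction requested above (columns 1, 2, 3, 7, 8 and 10 for SVs in one region, plus the RNAMES list) could be sketched as follows. This is a minimal, unofficial example: it assumes an uncompressed, tab-separated VCF, the function names `extract_region` and `read_names` are hypothetical, and the records and coordinates in the example are made up rather than taken from the attached files.

```python
def extract_region(vcf_lines, chrom, start, end, columns=(1, 2, 3, 7, 8, 10)):
    """Yield the selected 1-based columns of records on `chrom` with start <= POS <= end."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        if fields[0] != chrom or not (start <= int(fields[1]) <= end):
            continue
        yield [fields[i - 1] for i in columns if i <= len(fields)]


def read_names(info):
    """Return the read names listed under RNAMES in an INFO string (empty list if absent)."""
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "RNAMES":
            return value.split(",")
    return []


# Made-up records for illustration only.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE\n",
    "chrA\t1000\tdel1\tN\t<DEL>\t60\tGT\tSVTYPE=DEL;RNAMES=read1,read2\tGT:GQ:DR:DV\t0/0:26:24:4\n",
    "chrB\t5000\tdel2\tN\t<DEL>\t60\tPASS\tSVTYPE=DEL;RNAMES=read3\tGT:GQ:DR:DV\t0/1:47:19:10\n",
]

rows = list(extract_region(example, "chrA", 500, 2000))
for row in rows:
    print("\t".join(row))       # CHROM, POS, ID, FILTER, INFO, sample column
    print(read_names(row[4]))   # row[4] is the INFO column
```

For a real file, the same functions can be fed an open file handle (for example `open("calls.vcf")`, where the file name is a placeholder); a bgzipped VCF would need `gzip.open(path, "rt")` instead.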

sumudu-rangika commented 3 months ago

Hi Luis,

Many thanks for your feedback.

As you said, the RNAMES are the reads you marked in red. I also tried the --no-qc option with Sniffles 2.3.3 (the latest version). It detects the large (~12,150 bp) deletion, but the VCF has more than one entry for the same deletion, each with different RNAMES. I have attached the extracted region of interest as a .txt file for your reference. In this case, what would be the best way to run Sniffles? I would greatly appreciate your help in understanding this.
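One way to inspect such split entries is to pool the RNAMES of nearby records of the same SV type and count the distinct supporting reads. The sketch below is only an inspection aid, not how Sniffles merges calls: the function names, the 50 bp position tolerance and the example records are all assumptions made for illustration.

```python
def info_value(info, key):
    """Return the value of `key` in a VCF INFO string, or None if it is absent."""
    for entry in info.split(";"):
        k, _, v = entry.partition("=")
        if k == key:
            return v
    return None


def pool_read_names(records, pos_tolerance=50):
    """Group (pos, info) records of the same SVTYPE whose positions lie within
    `pos_tolerance` of the previous record in the group, and union their RNAMES."""
    groups = []
    for pos, info in sorted(records):
        svtype = info_value(info, "SVTYPE")
        names = {n for n in (info_value(info, "RNAMES") or "").split(",") if n}
        for group in groups:
            if group["svtype"] == svtype and abs(pos - group["last_pos"]) <= pos_tolerance:
                group["reads"] |= names
                group["last_pos"] = pos
                group["records"] += 1
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append({"svtype": svtype, "first_pos": pos, "last_pos": pos,
                           "reads": names, "records": 1})
    return groups


# Made-up records for illustration only.
records = [
    (1000, "SVTYPE=DEL;RNAMES=r1,r2,r3,r4"),
    (1012, "SVTYPE=DEL;RNAMES=r5,r6,r7"),
    (9000, "SVTYPE=INS;RNAMES=r8"),
]
groups = pool_read_names(records)
for group in groups:
    print(group["svtype"], group["first_pos"], group["records"], len(group["reads"]))
```

If the pooled read count for the split deletion entries matches the number of supporting reads visible in IGV, that would suggest the reads were detected but assigned to separate calls.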

I also have hybrid SVs in the CYP2D6 gene that are represented as insertions, which I'm trying to figure out. I'll post that as a separate issue.

Thank you. Best, Sumudu

test.txt