fritzsedlazeck / Sniffles

Structural variation caller using third generation sequencing

Multisample Calling with BND (translocation) variants #412

Open tbenavi1 opened 1 year ago

tbenavi1 commented 1 year ago

Hello,

I am running into an issue with multisample calling for BND variants. I am merging two samples together.

Both of the following lines are in the multisample VCF, even though they correspond to the same variant:

chr1 126647803 Sniffles2.BND.632M0 N N]chr7:98078765] ...  SUPP_VEC=01
chr7 98082316 Sniffles2.BND.534M6 N N]chr1:126644102] ... SUPP_VEC=11

I would expect there to only be a single line with SUPP_VEC=11.

The first sample has the following line:

chr7    98082316        Sniffles2.BND.3D13S6    N       N]chr1:126646467] ... GT:GQ:DR:DV     0/1:13:13:5

The second sample has the following lines:

chr1    126647802       Sniffles2.BND.944CS0    N       N]chr7:98078765] ... GT:GQ:DR:DV     0/1:4:3:9
chr7    98082316        Sniffles2.BND.6584S6    N       N]chr1:126645438] ... GT:GQ:DR:DV     0/1:47:19:10

I believe that chr1:126645438 should actually be chr1:126647802, and that chr7:98078765 should actually be chr7:98082316. Both look like errors caused by hard-clipped reads (as in https://github.com/fritzsedlazeck/Sniffles/issues/409, which I believe might also be the underlying issue in https://github.com/fritzsedlazeck/Sniffles/issues/359). These errors contribute to the second sample having multiple lines in its VCF file when there should only be one.
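To put numbers on the discrepancy, here is a quick check in Python. The coordinates are copied from the second sample's two records above; the variable names are only for illustration.

```python
# Coordinates taken from the second sample's two BND records.
expected_chr1 = 126647802       # POS of the chr1 record
reported_chr1_mate = 126645438  # mate position reported in the chr7 record
expected_chr7 = 98082316        # POS of the chr7 record
reported_chr7_mate = 98078765   # mate position reported in the chr1 record

# Distance between where each mate is reported and where the
# other record places the breakend.
chr1_offset = expected_chr1 - reported_chr1_mate
chr7_offset = expected_chr7 - reported_chr7_mate

print(chr1_offset, chr7_offset)  # 2364 3551
```

So the two records disagree about each breakend by a few kilobases, which seems large enough to matter for any position-based matching.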

I am not sure whether that is the only problem. I also wonder whether the order of the two chromosomes in the VCF line is preventing correct merging. For example, does Sniffles correctly merge the following two records:

chr1 pos1 ... ]chr2:pos2]
chr2 pos2 ... ]chr1:pos1]
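To make the question concrete, here is a minimal sketch of what I mean by treating the two records as one junction. This is not how Sniffles works internally (I have not checked); `parse_mate`, `junction_key`, `same_junction`, and the `tol` parameter are hypothetical names of my own. The sketch ignores breakend orientation and only compares coordinates.

```python
import re
from typing import NamedTuple, Tuple

# Matches the four VCF breakend ALT forms: t[p[, t]p], ]p]t, [p[t
_BND_ALT = re.compile(
    r"^[^\[\]]*(?P<br>[\[\]])(?P<chrom>[^\[\]]+):(?P<pos>\d+)(?P=br)[^\[\]]*$"
)


class Breakend(NamedTuple):
    chrom: str
    pos: int


def parse_mate(alt: str) -> Breakend:
    """Extract the mate breakend (chrom, pos) from a BND ALT string."""
    m = _BND_ALT.match(alt)
    if m is None:
        raise ValueError(f"not a breakend ALT: {alt!r}")
    return Breakend(m["chrom"], int(m["pos"]))


def junction_key(chrom: str, pos: int, alt: str) -> Tuple[Breakend, Breakend]:
    """Return both breakends in sorted order, so that a record and its
    reciprocal (chromosomes swapped) produce the same key."""
    here = Breakend(chrom, pos)
    mate = parse_mate(alt)
    first, second = sorted((here, mate))
    return first, second


def same_junction(a, b, tol: int) -> bool:
    """True if both breakends agree on chromosome and lie within tol bp.
    tol is an illustrative parameter, not a Sniffles option."""
    return all(
        x.chrom == y.chrom and abs(x.pos - y.pos) <= tol
        for x, y in zip(a, b)
    )


# The two records from the second sample.
rec_a = junction_key("chr1", 126647802, "N]chr7:98078765]")
rec_b = junction_key("chr7", 98082316, "N]chr1:126645438]")
```

With exact coordinates, `junction_key("chr1", 100, "]chr2:200]")` and `junction_key("chr2", 200, "]chr1:100]")` come out equal. For the two real records, `same_junction(rec_a, rec_b, tol=5000)` is true while `tol=1000` is false, so whether they would be merged under a scheme like this depends entirely on the position tolerance.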

Thanks for your help.

fritzsedlazeck commented 1 year ago

Hey @tbenavi1, interesting observation. We will follow up on that. Fritz