cortes-ciriano-lab / savana

Somatic structural variant caller for long-read data
Apache License 2.0

Undetected reads in normal tissue #38

Open apsteinberg opened 5 months ago

apsteinberg commented 5 months ago

Hi there,

Thank you for developing this wonderful tool. I have a question about the detection of reads in normal tissue. For several of our SVs, savana detects supporting reads in the tumor but none in the normal. Below is an example of a deletion. The tumor BAM is shown in purple and the normal in green. The deletion region is highlighted in red at the top. I've highlighted the split reads that span the deletion in the tumor and normal with different colors.

[Image: SV_ID_29_for_savana_issue]

In this case we set --mapq 0, so the MAPQ = 0 reads in the normal should be included. Yet when we look at the .bedpe file, we see that savana finds no reads in the normal tissue at this breakpoint. We are running savana 0.2.4. Do you have any suggestions for what we could adjust on our end so that savana detects these normal reads?

Thanks for your time and help.

Best, Asher

helrick commented 5 months ago

Hi Asher,

Thanks for raising this issue - I agree that it definitely looks like those normal reads should be included as evidence. Are you able to share the VCF lines for the tumour variant that is called by SAVANA? Additionally, do the two normal reads appear at all in the sv_breakpoints_read_support file? If so, would you be able to send along the VCF lines of any variants associated with those reads?

Finally, would you be able to confirm that those two normal reads are primary alignments and not secondary or supplementary?
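One way to check this without relying on IGV is to decode the FLAG value of each read (the second column of a SAM record). The sketch below is only an illustration and is not part of SAVANA; the bit values come from the SAM specification, and the function name `alignment_type` is made up for this example.

```python
# SAM FLAG bits as defined in the SAM specification.
FLAG_UNMAPPED = 0x4
FLAG_SECONDARY = 0x100
FLAG_SUPPLEMENTARY = 0x800


def alignment_type(flag: int) -> str:
    """Classify a SAM record by its FLAG value (hypothetical helper)."""
    if flag & FLAG_UNMAPPED:
        return "unmapped"
    if flag & FLAG_SECONDARY:
        return "secondary"
    if flag & FLAG_SUPPLEMENTARY:
        return "supplementary"
    return "primary"


# A few common FLAG values; 16 is the reverse-strand bit.
for flag in (0, 16, 256, 272, 2048, 2064):
    print(flag, alignment_type(flag))
```

A read whose FLAG has bit 0x100 set is a secondary alignment, and one with bit 0x800 set is a supplementary alignment; a mapped read with neither bit set is a primary alignment.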

All the best, Hillary

apsteinberg commented 5 months ago

Hi Hillary,

Thanks for the quick response and apologies for the delayed reply! Below are the VCF lines corresponding to the variant:

chr10   133786982   ID_70428_1  T   T[chr10:133787062[  .   PASS    SVTYPE=BND;MATEID=ID_70428_2;NORMAL_SUPPORT=0;TUMOUR_SUPPORT=6;SVLEN=80;BP_NOTATION=+-;ORIGINATING_CLUSTER=5e4cf8891b8f47a2b86393978d175684;END_CLUSTER=6f3be508f7a6484cbe8915303cb8e1e7;ORIGIN_STARTS_STD_DEV=37.27;ORIGIN_STARTS_MEDIAN=133786982.0;ORIGIN_EVENT_SIZE_STD_DEV=16.54;ORIGIN_EVENT_SIZE_MEDIAN=80.0;ORIGIN_EVENT_SIZE_MEAN=71.83;ORIGIN_UNCERTAINTY=671.1;ORIGIN_EVENT_HEURISTIC=0.21;END_STARTS_STD_DEV=20.85;END_STARTS_MEDIAN=133787062.0;END_EVENT_SIZE_STD_DEV=16.54;END_EVENT_SIZE_MEDIAN=80.0;END_EVENT_SIZE_MEAN=71.83;END_UNCERTAINTY=383.12;END_EVENT_HEURISTIC=0.21  GT  0/1
chr10   133787062   ID_70428_2  G   ]chr10:133786982]G  .   PASS    SVTYPE=BND;MATEID=ID_70428_1;NORMAL_SUPPORT=0;TUMOUR_SUPPORT=6;SVLEN=80;BP_NOTATION=+-;ORIGINATING_CLUSTER=5e4cf8891b8f47a2b86393978d175684;END_CLUSTER=6f3be508f7a6484cbe8915303cb8e1e7;ORIGIN_STARTS_STD_DEV=37.27;ORIGIN_STARTS_MEDIAN=133786982.0;ORIGIN_EVENT_SIZE_STD_DEV=16.54;ORIGIN_EVENT_SIZE_MEDIAN=80.0;ORIGIN_EVENT_SIZE_MEAN=71.83;ORIGIN_UNCERTAINTY=671.1;ORIGIN_EVENT_HEURISTIC=0.21;END_STARTS_STD_DEV=20.85;END_STARTS_MEDIAN=133787062.0;END_EVENT_SIZE_STD_DEV=16.54;END_EVENT_SIZE_MEDIAN=80.0;END_EVENT_SIZE_MEAN=71.83;END_UNCERTAINTY=383.12;END_EVENT_HEURISTIC=0.21  GT  0/1
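For anyone who wants to pull the support counts out of records like these programmatically, here is a minimal sketch that splits a VCF INFO column into key/value pairs. It assumes only the standard semicolon-separated `KEY=VALUE` layout; `parse_info` is a hypothetical helper, and the INFO string is a shortened copy of the first record above.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string into a dict; keys without '=' map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


# Shortened INFO column from the first record above.
info = "SVTYPE=BND;MATEID=ID_70428_2;NORMAL_SUPPORT=0;TUMOUR_SUPPORT=6;SVLEN=80"
fields = parse_info(info)

normal = int(fields["NORMAL_SUPPORT"])
tumour = int(fields["TUMOUR_SUPPORT"])
print(f"normal={normal} tumour={tumour}")
```

Both records report NORMAL_SUPPORT=0 and TUMOUR_SUPPORT=6, which matches what we see in the .bedpe file.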

However, I took a look again in IGV and it actually appears these two normal reads are secondary alignments, and I can confirm that they do not show up in the sv_breakpoints_read_support file.

Does savana ignore all secondary and supplementary alignments? This would be good to know going forward. And I guess this is perhaps one of the issues with including these MAPQ = 0 reads, as we did in our analysis?

Thanks again for your time and help.

Best wishes, Asher

helrick commented 4 months ago

Hi Asher,

While supplementary alignments are considered by SAVANA as evidence, at this time we don't consider secondary alignments, since they are of lower quality than primary alignments and we have found that they mostly add noise. Thank you for the example above, as I can see that in this particular case it might be useful to consider evidence from secondary alignments. I will look into adding an option to consider secondary alignments in future versions of SAVANA, and I will keep this issue open in case others would also like to have this feature.
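To illustrate why lowering the MAPQ threshold alone does not bring those reads in, here is a rough sketch of the behaviour described above. It is not SAVANA's actual code: `counts_as_evidence` and the read names are hypothetical, and only the SAM FLAG bits are taken from the SAM specification.

```python
def counts_as_evidence(flag: int, mapq: int, min_mapq: int = 0) -> bool:
    """Hypothetical filter mirroring the behaviour described in this thread."""
    if flag & 0x4:    # unmapped read
        return False
    if flag & 0x100:  # secondary alignment: excluded regardless of MAPQ
        return False
    # Primary and supplementary alignments only need to pass the MAPQ cut-off.
    return mapq >= min_mapq


# (name, FLAG, MAPQ) tuples; the names are made up for illustration.
reads = [
    ("tumour_primary", 0, 60),
    ("tumour_supplementary", 2048, 60),
    ("normal_secondary", 256, 0),
]

kept = [name for name, flag, mapq in reads
        if counts_as_evidence(flag, mapq, min_mapq=0)]
print(kept)
```

Even with a minimum MAPQ of 0, the secondary alignment is dropped because of its flag, not its mapping quality.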

Many thanks, Hillary