Clinical-Genomics / BALSAMIC

Bioinformatic Analysis pipeLine for SomAtic Mutations In Cancer
https://balsamic.readthedocs.io/
MIT License

[Assessment] Does Manta's call-regions option allow variants with only 1 BND in the target bed? #1387

Open mathiasbio opened 6 months ago

mathiasbio commented 6 months ago

Description

Manta has a --call-regions argument (spelled --callRegions in configManta.py) that takes a bedfile and restricts variant calling to those regions. This sounds good for our TGA cases, where most off-target SVs are likely false positives due to ambiguously mapped reads, problematic regions, etc. However, it is easy to imagine that many SVs with one BND within a target of interest have the second BND in some intragenic region entirely outside of the panel design. It would be nice to investigate whether Manta --call-regions still keeps variants as long as 1 BND is within the supplied bedfile.

Suggested approach

Run the same TGA case twice with identical settings, except that one run adds --call-regions with the appropriate bedfile. Then compare the variants: look for any variant that was called without --call-regions, has at least 1 BND within the target region, and is no longer called when using --call-regions.
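The comparison could be sketched roughly as below. This is only an illustration, not BALSAMIC code: it assumes each variant has already been reduced to a pair of breakpoints `((chrom1, pos1), (chrom2, pos2))` with 1-based VCF positions, and that the same variant gets identical coordinates in both runs (in practice some position tolerance may be needed). The function names `load_bed`, `in_targets` and `lost_with_bnd_on_target` are made up for this example.

```python
from bisect import bisect_right
from collections import defaultdict


def load_bed(lines):
    """Parse BED lines into sorted, merged (start, end) intervals per chromosome.

    BED coordinates are 0-based and half-open.
    """
    by_chrom = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        by_chrom[chrom].append((int(start), int(end)))
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [intervals[0]]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:  # overlapping or adjacent: extend
                out[-1] = (out[-1][0], max(out[-1][1], end))
            else:
                out.append((start, end))
        merged[chrom] = out
    return merged


def in_targets(targets, chrom, pos):
    """True if the 1-based position `pos` lies inside a target interval."""
    intervals = targets.get(chrom, [])
    # Index of the last interval whose start is <= pos - 1 (0-based position).
    i = bisect_right(intervals, (pos - 1, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos - 1 < intervals[i][1]


def lost_with_bnd_on_target(without_regions, with_regions, targets):
    """Variants only in the unrestricted run with >= 1 breakpoint on target.

    Returns (variant, number_of_breakpoints_on_target) tuples.
    """
    kept = set(with_regions)
    lost = []
    for variant in without_regions:
        if variant in kept:
            continue
        hits = sum(in_targets(targets, chrom, pos) for chrom, pos in variant)
        if hits >= 1:
            lost.append((variant, hits))
    return lost
```

Variants reported with one breakpoint on target are the interesting ones for this question: they were callable without the region restriction and disappeared with it.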

Criteria

No response

Origin

No response

Anything else?

No response

mathiasbio commented 6 months ago

I ran one TGA case according to the suggested strategy, with and without --call-regions. The number of variants dropped from 15488 to 25, which is a nice change. However, looking in IGV, it seems that with this option the variant is required to lie entirely within the target bed regions, at least sometimes...

As always, nothing is entirely clear-cut. Below is a variant with only 1 BND in the target, which is no longer called. (Sorry, the top track is with --call-regions, the bottom one is without.)

image

And here is another example: a variant that is called with the --call-regions option and that does extend into non-targeted territory. (Sorry, the top track is with --call-regions, the bottom one is without.)

image

It could be that there is some flexibility here, so that a variant is allowed to extend outside the target region, but only up to a certain point. Either way, I feel that using this option risks excluding variants that may be of interest. If the goal is to reduce the number of variants, there are other options, like adding some bcftools filters on frequency and read depth, which seems to work quite well!
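To make the frequency and read-depth idea concrete, here is a rough sketch of that kind of filter logic. It is not the bcftools expression used in BALSAMIC: it assumes the support counts come from Manta's PR (spanning paired-read) and SR (split-read) FORMAT fields, each a "ref,alt" pair, and the thresholds `min_depth` and `min_fraction` are placeholder values chosen only for illustration.

```python
def alt_support(pr, sr=None):
    """Return (depth, alt_fraction) from 'ref,alt' count strings.

    `pr` and `sr` are the raw FORMAT values; a missing field may be
    given as None or '.' and is then ignored.
    """
    ref = alt = 0
    for field in (pr, sr):
        if field in (None, "."):
            continue
        ref_count, alt_count = (int(x) for x in field.split(","))
        ref += ref_count
        alt += alt_count
    depth = ref + alt
    return depth, (alt / depth if depth else 0.0)


def passes(pr, sr=None, min_depth=10, min_fraction=0.05):
    """Keep a variant only if it has enough total and alt-supporting reads."""
    depth, fraction = alt_support(pr, sr)
    return depth >= min_depth and fraction >= min_fraction
```

Unlike restricting calling to the target regions, a filter of this kind does not depend on where the second BND falls, so variants with one breakpoint outside the panel design are judged on their read support alone.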

fevac commented 6 months ago

Your conclusion sounds reasonable to me. Thanks for looking into it!