Clinical-Genomics / BALSAMIC

Bioinformatic Analysis pipeLine for SomAtic Mutations In Cancer
https://balsamic.readthedocs.io/
MIT License

[User Story] Variants with VAF of 1 should not be automatically filtered out #1345

Open mathiasbio opened 11 months ago

mathiasbio commented 11 months ago

Need

As a clinician, I want to be able to detect all true somatic variants. Currently, in tumor-only WGS cases, we filter out all somatic variants with a VAF of 1. The cause is a bcftools filter that aims to remove variants without balanced strand representation, and which requires that at least a couple of reads support the reference allele.

As has been observed previously, some somatic variants can have an unusually high presence in the DNA extracted from the tumor. One example is described in this issue: https://github.com/Clinical-Genomics/BALSAMIC/issues/1166 where some variants were filtered out in an Interlaboratory Comparison (ILC).

Beyond that ILC, I am not aware of any concrete examples where a clinically relevant variant with such a high VAF has been filtered out, as it should be a fairly rare event. However, during the work on the GMS-BT harmonisation project, one clinically relevant variant almost reached this level, with a VAF of 0.996. So it seems that we are at risk of occasionally filtering out very important variants with this filter.

Suggested approach

Some work has already been done to investigate the effects of a related filter: https://github.com/Clinical-Genomics/BALSAMIC/issues/1180

That related filter (max allele frequency of 1) has already been removed in release 13 of BALSAMIC, but the strand bias filter is still removing all variants with a tumor AF of 1.

Perhaps this filter can be adjusted to only check strand bias for the alt allele. Before implementing this change, it could be worth evaluating its effect on false positives and on the inclusion of extra germline variants, since the requirement that some reads support the reference allele has the side effect of filtering out homozygous germline variants in tumor-only WGS cases.
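To make the difference concrete, here is a minimal Python sketch of the two behaviours. It is not the actual bcftools expression used in BALSAMIC, which is not quoted in this issue. The `StrandCounts` class, both function names, and the `min_reads` threshold are hypothetical and only illustrate the logic: a filter that needs reads on both strands for both alleles rejects any variant with a VAF of 1, while an alt-only check keeps it and still rejects a strand-biased alt allele.

```python
from dataclasses import dataclass


@dataclass
class StrandCounts:
    """Per-strand read counts for one variant (hypothetical container)."""
    ref_fwd: int
    ref_rev: int
    alt_fwd: int
    alt_rev: int

    @property
    def vaf(self) -> float:
        """Variant allele frequency: alt reads divided by total reads."""
        depth = self.ref_fwd + self.ref_rev + self.alt_fwd + self.alt_rev
        return (self.alt_fwd + self.alt_rev) / depth if depth else 0.0


def passes_ref_and_alt_filter(c: StrandCounts, min_reads: int = 1) -> bool:
    """Stand-in for the current behaviour (assumed form): require reads
    on both strands for BOTH the reference and the alt allele."""
    return min(c.ref_fwd, c.ref_rev, c.alt_fwd, c.alt_rev) >= min_reads


def passes_alt_only_filter(c: StrandCounts, min_reads: int = 1) -> bool:
    """Stand-in for the suggested behaviour: require reads on both
    strands for the alt allele only."""
    return min(c.alt_fwd, c.alt_rev) >= min_reads


# A variant with no reference reads (VAF of 1), balanced on the alt allele.
vaf_one = StrandCounts(ref_fwd=0, ref_rev=0, alt_fwd=20, alt_rev=18)
# A variant whose alt reads all come from the forward strand.
alt_biased = StrandCounts(ref_fwd=10, ref_rev=12, alt_fwd=15, alt_rev=0)

for name, counts in [("vaf_one", vaf_one), ("alt_biased", alt_biased)]:
    print(
        name,
        round(counts.vaf, 3),
        passes_ref_and_alt_filter(counts),
        passes_alt_only_filter(counts),
    )
```

Note that the alt-only check also lets homozygous germline variants through in tumor-only cases, which is the side effect that would need to be evaluated before making the change.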

Considered alternatives

No response

Deviation

No response

System requirements assessed

Requirements affected by this story

No response

Risk assessment needed

Risk assessment

No response

SOUPs

No response

Can be closed when

No response

Blockers

No response

Anything else?

No response

mathiasbio commented 10 months ago

This has been partially solved in: https://github.com/Clinical-Genomics/BALSAMIC/pull/1338 Merged in Release 13: https://github.com/Clinical-Genomics/BALSAMIC/pull/1320

It has been solved in every case except tumor-only WGS, where another filter still requires that some reads support the reference allele.