adamewing / bamsurgeon

tools for adding mutations to existing .bam files, used for testing mutation callers
MIT License

snv not inserted on reads with snv close-by #123

Closed tdelcourt closed 5 years ago

tdelcourt commented 5 years ago

I am trying to insert two nearby SNVs, but only the second one is inserted correctly. The first one is not inserted on the reads that also cover the second position (see image).

I found a workaround: setting haplosize to a value greater than the read length (250). This works well when inserting at 100% VAF, but would probably cause problems at lower VAFs.

Am I missing something? Is there an option that handles this case differently? Is this a bug? It may be related to https://github.com/adamewing/bamsurgeon/issues/40

[image: missing snv]

adamewing commented 5 years ago

That's exactly the situation the "haplosize" argument is designed to address. It should work fine with VAF < 1.0 but the linked SNPs will have the same VAF in the output.

The reason is that mutations are generated independently by default, so if two mutations share reads because they are close together, one of them will be lost (or have its VAF reduced) in the final merge step. Using -z makes addsnv more haplotype-aware: it identifies SNPs that are likely to share reads and modifies those reads once for all SNPs in a haplotype group.
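To make the merge behaviour concrete, here is a toy model in Python. It is not bamsurgeon's code: the read representation, the "last write wins" merge, and all function names are assumptions made for illustration, and every SNV is applied at VAF 1.0. It only shows why independently mutated copies of a shared read lose one SNV, and why grouping SNVs within a haplosize window avoids that.

```python
# Toy model only: reads are {name: (start, sequence)}, SNVs are (pos, alt).
# All names here are hypothetical and do not come from bamsurgeon.

def covers(start, seq, pos):
    """True if the read spans reference position pos (0-based)."""
    return start <= pos < start + len(seq)

def apply_snv(seq, start, pos, alt):
    """Return seq with alt substituted at reference position pos."""
    i = pos - start
    return seq[:i] + alt + seq[i + 1:]

def mutate_independently(reads, snvs):
    """Mutate the ORIGINAL reads separately per SNV, then merge (last write wins)."""
    merged = dict(reads)
    for pos, alt in snvs:
        modified = {name: (start, apply_snv(seq, start, pos, alt))
                    for name, (start, seq) in reads.items()
                    if covers(start, seq, pos)}
        merged.update(modified)  # a shared read keeps only the later SNV
    return merged

def group_by_haplosize(snvs, haplosize):
    """Chain sorted SNVs into groups when neighbours are within haplosize bases."""
    groups = []
    for snv in sorted(snvs):
        if groups and snv[0] - groups[-1][-1][0] <= haplosize:
            groups[-1].append(snv)
        else:
            groups.append([snv])
    return groups

def mutate_grouped(reads, snvs, haplosize):
    """Modify each read once per group, applying every SNV in the group it covers."""
    merged = dict(reads)
    for group in group_by_haplosize(snvs, haplosize):
        for name, (start, seq) in reads.items():
            new_seq = seq
            for pos, alt in group:
                if covers(start, seq, pos):
                    new_seq = apply_snv(new_seq, start, pos, alt)
            if new_seq != seq:
                merged[name] = (start, new_seq)
    return merged

reads = {"r1": (0, "AAAAAAAAAA"), "r2": (3, "AAAAAAAAAA"), "r3": (8, "AAAAAAAAAA")}
snvs = [(4, "C"), (9, "G")]

print(mutate_independently(reads, snvs)["r1"][1])  # only the second SNV survives on r1
print(mutate_grouped(reads, snvs, 10)["r1"][1])    # both SNVs are present on r1
```

In this sketch, a haplosize smaller than the distance between the two SNVs puts them in separate groups, which reproduces the independent behaviour.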

tdelcourt commented 5 years ago

Yes, I assumed the final merge was the reason. If it ever became a problem, I guess running successive passes of bamsurgeon, one for each individual mutation, would be OK if haplotypes were to be ignored completely.

Thanks for the answer,

thomas
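The successive-pass idea mentioned above could be sketched as follows. This is only a sketch under assumptions: the helper name, the file names, and the output naming scheme are hypothetical, and the `-v`, `-f`, `-r`, and `-o` options are assumed to be addsnv.py's variant file, input BAM, reference, and output BAM options (only `-z` is named in this thread), so check them against `addsnv.py --help` before use. The code only builds the command lines and does not run anything.

```python
def sequential_addsnv_commands(varfiles, in_bam, reference, out_prefix):
    """Build one addsnv.py command per variant file, feeding each output BAM
    into the next pass. Option names are assumptions; verify with --help."""
    commands = []
    current = in_bam
    for i, varfile in enumerate(varfiles, start=1):
        out_bam = f"{out_prefix}.pass{i}.bam"
        commands.append(["addsnv.py", "-v", varfile, "-f", current,
                         "-r", reference, "-o", out_bam])
        current = out_bam  # the next pass starts from this pass's output
    return commands

# Hypothetical inputs: one variant file per SNV.
cmds = sequential_addsnv_commands(["snv1.txt", "snv2.txt"],
                                  "input.bam", "ref.fa", "mutated")
for cmd in cmds:
    print(" ".join(cmd))
```

Each intermediate BAM would presumably need to be sorted and indexed before the next pass. Because each pass selects reads without regard to earlier mutations, the SNVs are not placed on a common haplotype, which is the trade-off described above.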