Open jkbonfield opened 2 months ago
See also #2282 as an alternative to this PR. That is probably the way to go for compatibility, but then again I doubt anyone is currently using bcftools on RNASeq data with ref-skip based alignments given all such alignments were simply discarded, so it's unlikely the change in behaviour would break any currently working pipelines.
I believe the edited BAM_CREF_SKIP checking code https://github.com/samtools/bcftools/blob/ef8b974dee6d3bc9f56e39a75997a64a13cbcaa1/bam2bcf_indel.c#L855-L859 simply skips these reads from realignment. I am imagining a proper fix would be to split the reads into parts and deal with each separately. Unfortunately the current framework might not be bent easily for that.
If we should merge any version of this, we'd need to include a small test to demonstrate where it can be helpful.
Mpileup removes alignments using the cigar ref skip operator ("N"). This was originally added in 2011 in samtools/samtools#d1643d6 with the commit message of "fixed a bug in indel calling related to unmapped and refskip reads".
Unfortunately I don't know what that bug was, but removing the code shows it still works (at least for some data!). We need better understanding of what's going on and why it was added, so perhaps we should add a command line option to control this instead?
Fixes #2277