Closed Audald closed 1 year ago
Thanks @Audald -
I suppose this must be from supplementary reads with hard-clipping? If so, the bam spec doesn't provide much on how hard-clip and methylation tags should interact. By my reading, once hard-clipping is introduced there's no way the MM/ML tags could be interpretable unless they were clipped as well. Can I ask what read mapper and settings you're using for this case?
More discussion on this topic here:
https://github.com/samtools/hts-specs/issues/646
Looks like to avoid this issue you need to add the -Y
flag if you're using minimap2, or use pbmm2. I will update the README to note this.
Thanks @ctsa for looking into this. The hts-specs
discussion is very helpful and the suggested flag in minimap2
should be enough. Thanks for indicating it also in the documentation!
Sounds good. Updated readme:
https://github.com/PacificBiosciences/pb-CpG-tools#input-alignment-file
Closing this issue as explained.
CRAM/BAM files with supplementary reads cannot be processed by
aligned_bam_to_cpg_scores
(both for version2.2
and2.3
). The job is immediately stopped with the errorthread '<unnamed>' panicked at 'Read sequence is too short for MM tag in record: {qname}'
.The BED and BigWig files are successfully generated if the secondary and supplementary alignments are filtered out from the CRAM/BAM files with
-F 2308
.I am reporting this issue in case you want to consider filtering as a user option.