PacificBiosciences / pb-CpG-tools

Collection of tools for the analysis of CpG data
BSD 3-Clause Clear License

Error with haplotagged BAM files - Unexpected HP tag format #47

Closed Audald closed 1 year ago

Audald commented 1 year ago

I am facing a problem when running aligned_bam_to_cpg_scores for haplotagged BAM files. This is the command used:

/path/pb-CpG-tools-v2.3.0-x86_64-unknown-linux-gnu/bin/aligned_bam_to_cpg_scores --bam /path/sample.haplotagged.bam --output-prefix /path/phased/BEDs/sample.pbmm2 --model /path/pb-CpG-tools-v2.3.0-x86_64-unknown-linux-gnu/models/pileup_calling_model.v1.tflite --threads=8

And the error is as follows:

thread '<unnamed>' panicked at 'Unexpected HP tag format in read ABCD I32(2)', src/meth_read_processor.rs:1128:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

The HP tag in read ABCD is HP:i:2, which appears to match the expected format (see #43). We then realised that some of the reads are not HP-tagged (ambiguous reads?) and thought that could be the cause. However, the error persists after the untagged reads are removed from the BAM file.

The phasing was performed with HiPhase.

ctsa commented 1 year ago

Thanks @Audald - there is an issue here handling the binary representation of the integer in the HP field -- it looks like whatshap always writes this as a uint8, and here it's an int32. I'll fix the method to handle all int representations.
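For readers following along, the idea of accepting every integer encoding can be sketched roughly as below. This is only an illustration, not the actual pb-CpG-tools code: `AuxInt` is a hypothetical local enum standing in for the integer widths a BAM aux field can carry, `haplotype_from_hp` is a made-up helper name, and treating negative values as invalid is an assumption.

```rust
/// Hypothetical stand-in for the integer encodings a BAM aux field may use.
/// Writers are free to pick any of these widths for the same logical value.
#[derive(Debug, Clone, Copy, PartialEq)]
enum AuxInt {
    I8(i8),
    U8(u8),
    I16(i16),
    U16(u16),
    I32(i32),
    U32(u32),
}

/// Convert any integer encoding of an HP tag into a haplotype index.
/// Assumption: negative values are not valid haplotype ids, so they yield None.
fn haplotype_from_hp(tag: AuxInt) -> Option<u32> {
    // Widen every variant to i64 first so one range check covers all cases.
    let value: i64 = match tag {
        AuxInt::I8(v) => i64::from(v),
        AuxInt::U8(v) => i64::from(v),
        AuxInt::I16(v) => i64::from(v),
        AuxInt::U16(v) => i64::from(v),
        AuxInt::I32(v) => i64::from(v),
        AuxInt::U32(v) => i64::from(v),
    };
    if value >= 0 && value <= i64::from(u32::MAX) {
        Some(value as u32)
    } else {
        None
    }
}
```

With a conversion like this, `AuxInt::U8(2)` and `AuxInt::I32(2)` both map to haplotype 2, so the parser no longer depends on which width the phasing tool happened to write.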

ctsa commented 1 year ago

Here's the update with a fixed HP parser; hopefully this takes care of the issue:

https://github.com/PacificBiosciences/pb-CpG-tools/releases/tag/v2.3.1

Audald commented 1 year ago

Hi @ctsa, thanks for the prompt fix. It does work!

ctsa commented 1 year ago

Good to hear. Closing as resolved.