pysam-developers / pysam

Pysam is a Python package for reading, manipulating, and writing genomics data such as SAM/BAM/CRAM and VCF/BCF files. It's a lightweight wrapper of the HTSlib API, the same one that powers samtools, bcftools, and tabix.
https://pysam.readthedocs.io/en/latest/
MIT License
774 stars 274 forks

reconstituted AlignedSegment is not equivalent to original #1240

Open tzuni opened 10 months ago

tzuni commented 10 months ago

If I read an alignment from a BAM file and export it with the to_dict() or to_string() methods, the AlignedSegment object recreated by from_dict() or fromstring() is not equal to the original.

Example with to_string(); the same happens with the dictionary version:

foo_str = orig_aln.to_string()
foo_header = orig_aln.header
foo = pysam.AlignedSegment.fromstring(foo_str, foo_header)

This is False:

foo == orig_aln

This is True:

foo.to_string() == orig_aln.to_string()

However, I can't find what is different about them.

jmarshall commented 10 months ago

Exercising this on a small amount of test data produced equivalent results for me.

You'll need to add some test data that demonstrates the problem for us to investigate further.