simon-anders / htseq

HTSeq is a Python library to facilitate processing and analysis of data from high-throughput sequencing (HTS) experiments.
https://htseq.readthedocs.io/en/release_0.11.1/
GNU General Public License v3.0

add `check_sq` option `BAM_Reader` #53

Closed: jbloom closed this 6 years ago

jbloom commented 6 years ago

BAM files created by some programs (e.g., PacBio's CCS) lack @SQ header lines. pysam can still read them when the file is opened with check_sq=False. This commit adds an option to pass that argument through HTSeq.BAM_Reader.
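As a rough illustration of the change described here (not the actual diff), a reader class can store the flag and hand it to pysam when the file is opened. `BamReaderSketch`, `open_bam`, and `_open_with_pysam` are hypothetical names used only for this sketch; the one real call is `pysam.AlignmentFile(..., check_sq=...)`.

```python
def _open_with_pysam(filename, check_sq=True):
    """Open a BAM file with pysam, forwarding check_sq unchanged."""
    import pysam  # imported lazily so the sketch can be loaded without pysam

    return pysam.AlignmentFile(filename, "rb", check_sq=check_sq)


class BamReaderSketch:
    """Hypothetical stand-in for a BAM reader that exposes check_sq."""

    def __init__(self, filename, check_sq=True, open_bam=None):
        self.filename = filename
        # Default True keeps the strict behaviour for existing callers.
        self.check_sq = check_sq
        # open_bam is an assumed injection point so the sketch can be
        # exercised with a fake opener; by default pysam is used.
        self._open_bam = open_bam or _open_with_pysam

    def __iter__(self):
        # The file is opened only when iteration starts, and check_sq is
        # passed straight through to the opener.
        bam = self._open_bam(self.filename, check_sq=self.check_sq)
        try:
            for record in bam:
                yield record
        finally:
            bam.close()
```

With pysam installed, `for read in BamReaderSketch("reads.bam", check_sq=False): ...` would iterate over a BAM file whose header has no @SQ lines (the file name is a placeholder). Defaulting to `check_sq=True` means callers who never pass the argument see no change in behaviour.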

iosonofabio commented 6 years ago

Hi Jesse,

Thanks for the PR. Can you please fix it for Python 2 as well? Then I'll try to write a test for this if I can find a short PacBio BAM file; if you could send me one, that would be great ;-)

Fabio

On May 6, 2018 9:31:57 AM PDT, Jesse Bloom notifications@github.com wrote:

-- Commit Summary --

  • add check_sq option BAM_Reader

-- File Changes --

M python3/HTSeq/__init__.py (11)

jbloom commented 6 years ago

Hi Fabio,

The last commit e9efae3 should have added it for python2 as well.

I have attached a short example PacBio CCS file that BAM_Reader can read with check_sq=False but not with the default check_sq=True. (I had to zip the file before GitHub would let me upload it, which is why it is a zipped BAM.)

Thanks!

--Jesse

Attachment: short_test_ccs.bam.zip
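The behaviour described for the attached file could be pinned down with a test along these lines. This is only a sketch: it calls pysam directly rather than HTSeq, it assumes the attachment has been unzipped to `short_test_ccs.bam` in the working directory, and the class and function names are made up for illustration. The tests are skipped when pysam or the file is not available.

```python
import importlib.util
import os
import unittest

# Assumed location of the unzipped attachment from this thread.
BAM_PATH = "short_test_ccs.bam"

HAVE_PYSAM = importlib.util.find_spec("pysam") is not None
HAVE_FILE = os.path.exists(BAM_PATH)


@unittest.skipUnless(HAVE_PYSAM and HAVE_FILE, "needs pysam and the example BAM")
class CheckSqTest(unittest.TestCase):
    def test_default_rejects_header_without_sq(self):
        import pysam

        # With the default check_sq=True, pysam refuses a header with no @SQ lines.
        with self.assertRaises(ValueError):
            pysam.AlignmentFile(BAM_PATH, "rb")

    def test_check_sq_false_reads_records(self):
        import pysam

        # With check_sq=False the same file opens and its records can be iterated.
        with pysam.AlignmentFile(BAM_PATH, "rb", check_sq=False) as bam:
            n_reads = sum(1 for _ in bam)
        self.assertGreater(n_reads, 0)


def run_suite():
    """Run the two tests above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CheckSqTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A matching test for HTSeq itself would swap the pysam calls for `HTSeq.BAM_Reader(BAM_PATH, check_sq=False)`, assuming the keyword is exposed as this PR proposes.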