pysam-developers / pysam

Pysam is a Python package for reading, manipulating, and writing genomics data such as SAM/BAM/CRAM and VCF/BCF files. It's a lightweight wrapper of the HTSlib API, the same one that powers samtools, bcftools, and tabix.
https://pysam.readthedocs.io/en/latest/
MIT License
774 stars 274 forks

fix: improve parsing performance #1227

Closed valentynbez closed 11 months ago

valentynbez commented 11 months ago

The original fix was submitted by @kloetz to the seqtk repo. It improves FASTA parsing throughput to 2 GB/s.

See:

jmarshall commented 11 months ago

Thanks. I was going to wait for this to come in with the next HTSlib import, but since samtools/htslib#1674 has already been merged and we use kseq within pysam, it's worth applying it now.