pysam-developers / pysam

Pysam is a Python package for reading, manipulating, and writing genomics data such as SAM/BAM/CRAM and VCF/BCF files. It's a lightweight wrapper of the HTSlib API, the same one that powers samtools, bcftools, and tabix.
https://pysam.readthedocs.io/en/latest/
MIT License

Seeking assistance #1244

Closed liujilei156231 closed 9 months ago

liujilei156231 commented 9 months ago

I have many sorted BAM files, and for my project I need to randomly select a subset of reads (1e5) from them. I tried converting a pysam object to a list, but this used a great deal of memory and was slow. The downsampling tools in samtools and Picard have the same problems. Is there a more efficient way?

jmarshall commented 9 months ago

This is not a pysam question.

I suggest you post your question at https://biostars.org/ or https://bioinformatics.stackexchange.com/.
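
For readers who land on this thread with the same question: one common way to draw a fixed-size random subset in a single pass, without loading every read into a list, is reservoir sampling. The sketch below is not something proposed in this thread; the names `reservoir_sample` and `sample_bam_reads` are hypothetical, and it assumes that each BAM is read sequentially with `pysam.AlignmentFile` and that holding 1e5 reads in memory is acceptable.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(items: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return k items chosen uniformly at random from a stream of unknown length.

    Memory use is proportional to k, not to the size of the stream.
    If the stream has fewer than k items, all of them are returned.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # both endpoints inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


def sample_bam_reads(bam_path: str, k: int = 100_000, seed: Optional[int] = None):
    """Hypothetical helper: sample k reads from one BAM file in a single pass."""
    import pysam  # imported here so the sampler above works without pysam

    with pysam.AlignmentFile(bam_path, "rb") as bam:
        # until_eof=True reads the file sequentially and includes unmapped reads.
        return reservoir_sample(bam.fetch(until_eof=True), k, seed)
```

The sampled reads come back in arbitrary order rather than coordinate order, and mates of a pair are selected independently, so a pair may be split. The whole file is still read once, so runtime scales with file size even though memory does not.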