pysam-developers / pysam

Pysam is a Python package for reading, manipulating, and writing genomics data such as SAM/BAM/CRAM and VCF/BCF files. It's a lightweight wrapper of the HTSlib API, the same one that powers samtools, bcftools, and tabix.
https://pysam.readthedocs.io/en/latest/
MIT License

No option to flush VariantFile.write() (or at least undocumented) #1299

Open blex-max opened 1 month ago

blex-max commented 1 month ago

Output is not written to disk until the program completes. Could an option to flush the output while writing be added? Many thanks for your work.

jmarshall commented 1 month ago

Much as with ordinary Python I/O streams, you should call myvariantfile.close() when you have finished writing to it to ensure the output is flushed and the file closed in a timely fashion.
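
To illustrate the comparison with ordinary Python I/O streams, here is a minimal sketch using only the standard library (no pysam). It assumes CPython's default buffering for a regular file; the helper name `size_during_write` and the file name `demo.txt` are made up for illustration. A small write typically stays in an in-memory buffer until `flush()` or `close()` is called:

```python
import os
import tempfile


def size_during_write(path):
    """Return the on-disk size before flush(), after flush(), and after close()."""
    f = open(path, "w")              # default buffering for a regular file
    f.write("chr1\t100\t.\tA\tT\n")  # small write, normally held in the buffer
    before = os.path.getsize(path)   # typically 0: nothing has reached the file yet
    f.flush()                        # push buffered data to the operating system
    after_flush = os.path.getsize(path)
    f.close()                        # close() also flushes; nothing new to write here
    after_close = os.path.getsize(path)
    return before, after_flush, after_close


with tempfile.TemporaryDirectory() as tmp:
    sizes = size_during_write(os.path.join(tmp, "demo.txt"))
    print(sizes)
```

This only demonstrates plain Python file objects. It says nothing about how HTSlib buffers output internally, but the practical consequence is the same: data is only guaranteed to be in the file once it has been flushed or closed.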

Alternatively, VariantFile, AlignmentFile, etc. can be used as context managers, so the following ensures the file is closed at the end of the with statement:

with pysam.VariantFile('foo.vcf', 'w') as f:
    f.write(record)
    …etc…

Pysam could do more for flushing these files while writing them: there is an underlying HTSlib hts_flush() function that VariantFile, AlignmentFile, etc. could expose as a flush() method. However, I suspect most code would not really benefit from being able to do this.

blex-max commented 1 month ago

In my case, I'd like to be able to watch 'tail filename' output as it is produced. I find it useful for debugging.