samtools / htsjdk

A Java API for high-throughput sequencing data (HTS) formats.
http://samtools.github.io/htsjdk/

VCFFileReader that can read from InputStream #1243

Closed: romanzenka closed this issue 5 years ago

romanzenka commented 5 years ago

I see that the related issue #63 was closed. We would like to parse a VCF file top to bottom, but we receive it as a stream, and the interfaces only accept files. Would it be possible to add an interface that allows VCFFileReader to be opened on a stream?

I am willing to make a pull request, but I would like your advice on the best way to go about it. My thought was to extend AbstractFeatureReader to take a stream and wrap it in the expected SeekableByteChannel (with seeking disallowed), then hope that the way we access the VCF never requires a seek (and if it does, cache and buffer so we can pretend to seek).

Knowing your codebase, could this work when we are simply parsing an entire VCF top to bottom?
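
To make the idea concrete, here is a rough sketch of the wrapper described above, using only the JDK. The class name `ForwardOnlyChannel` is hypothetical and is not part of htsjdk. Whether htsjdk's readers would accept such a channel without ever seeking backward is exactly the open question, so treat this as an illustration under that assumption rather than a working integration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.NonWritableChannelException;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;

/** Hypothetical forward-only channel over an InputStream (not an htsjdk class). */
final class ForwardOnlyChannel implements SeekableByteChannel {
    private final InputStream in;
    private long position = 0;
    private boolean open = true;

    ForwardOnlyChannel(InputStream in) {
        this.in = in;
    }

    @Override
    public int read(ByteBuffer dst) throws IOException {
        if (!open) throw new ClosedChannelException();
        if (!dst.hasRemaining()) return 0;
        byte[] buf = new byte[dst.remaining()];
        int n = in.read(buf, 0, buf.length);
        if (n > 0) {
            dst.put(buf, 0, n);
            position += n;
        }
        return n; // -1 signals end of stream
    }

    @Override
    public long position() throws IOException {
        if (!open) throw new ClosedChannelException();
        return position;
    }

    @Override
    public SeekableByteChannel position(long newPosition) throws IOException {
        if (!open) throw new ClosedChannelException();
        if (newPosition < position) {
            // The stream cannot be rewound; a real version might buffer instead.
            throw new IOException("backward seek not supported on a stream");
        }
        // A forward seek is emulated by reading and discarding bytes.
        while (position < newPosition) {
            if (in.read() < 0) break; // reached end of stream
            position++;
        }
        return this;
    }

    @Override
    public long size() throws IOException {
        throw new IOException("size is unknown for a stream");
    }

    @Override
    public int write(ByteBuffer src) {
        throw new NonWritableChannelException();
    }

    @Override
    public SeekableByteChannel truncate(long size) {
        throw new NonWritableChannelException();
    }

    @Override
    public boolean isOpen() {
        return open;
    }

    @Override
    public void close() throws IOException {
        open = false;
        in.close();
    }

    public static void main(String[] args) throws IOException {
        // Made-up two-line VCF header standing in for a real stream.
        byte[] vcf = "##fileformat=VCFv4.2\n#CHROM\tPOS\n".getBytes(StandardCharsets.UTF_8);
        try (ForwardOnlyChannel ch = new ForwardOnlyChannel(new ByteArrayInputStream(vcf))) {
            ByteBuffer buf = ByteBuffer.allocate(12);
            int n = ch.read(buf);
            System.out.println(new String(buf.array(), 0, n, StandardCharsets.US_ASCII));
            System.out.println("position = " + ch.position());
        }
    }
}
```

A backward `position(...)` call fails loudly here instead of silently returning wrong data, which would at least reveal whether the top-to-bottom read path ever tries to seek. The caching fallback mentioned above is not implemented in this sketch.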

lindenb commented 5 years ago

see also: https://github.com/samtools/htsjdk/pull/837

romanzenka commented 5 years ago

I am closing this issue as a duplicate of #837, which would satisfy my needs here.