samtools / htslib

C library for high-throughput sequencing data formats

HTS_IDX_REST does not behave as expected #188

Open ekg opened 9 years ago

ekg commented 9 years ago

Open a VCF file. Jump to a position:

iter = tbx_itr_querys(tbx, chroms.front().c_str());

Now try to change the iterator so that it just "goes from the current position" on:

tbx_itr_destroy(iter);
iter = tbx_itr_queryi(tbx, HTS_IDX_REST, 0, 0);

I'd like to be able to jump to a target region that covers the entire file.

This used to be possible with tabix via a call similar to tbx_itr_queryi(tbx, 0, 0, 0). What's changed? My workaround is to give my VCF reader two modes: in one, it steps through all chromosomes; in the other, it reads only the targets.

ekg commented 9 years ago

To clarify further, HTS_IDX_REST seems to behave the same as jumping to the very first line of the file. The parser then reads back through the header and emits a number of errors. There doesn't seem to be a way to seek to the first non-header record without encountering these errors.
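One way a reader could tolerate starting at the top of the file is to skip header lines itself, since VCF header and meta lines begin with '#'. A minimal sketch in plain C, with hypothetical helper names (`is_vcf_header_line`, `first_record_index`) that are not part of htslib:

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the line is a VCF meta or header
 * line (both start with '#'), 0 otherwise. */
int is_vcf_header_line(const char *line)
{
    return line != NULL && line[0] == '#';
}

/* Hypothetical helper: index of the first non-header line among n lines,
 * or n if every line is a header line. */
size_t first_record_index(const char *const *lines, size_t n)
{
    size_t i = 0;
    while (i < n && is_vcf_header_line(lines[i]))
        i++;
    return i;
}
```

A reader that applies this check to each line it pulls from the iterator would drop the header lines silently instead of reporting parse errors.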

It's not really a problem for me, because tabix makes it easy to get the list of chromosomes in the file, but it threw me for a loop. In effect, the API has changed relative to the standalone tabix/bgzip.