Closed h-2 closed 2 years ago
Do you know about the BAMIntervalTree created by @joshuak94 for indexing BAM files in another way?
Thanks for the pointer! I talked to him about it. But even if we want to support that, we also need to be able to handle the regular indexes.
It doesn't yet contain all the tests I would have liked, but this is as much as I can currently do for this feature.
This PR adds Tabix support and indexed VCF reading.
All Tabix code is currently in the detail namespace and will probably stay there for now.
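For readers unfamiliar with how a Tabix index locates a region, here is a minimal sketch of the standard binning calculation described in the Tabix and SAM specifications. It is not the code from this PR: the function name `region_to_bin` is hypothetical, and the sketch assumes a 0-based, half-open interval with `beg < end <= 2^29`.

```cpp
#include <cstdint>

// Hypothetical helper, not taken from this PR.
// Maps a 0-based, half-open interval [beg, end) to the smallest bin of the
// standard binning scheme that fully contains it.
// Assumes beg < end and end <= 2^29.
constexpr std::uint32_t region_to_bin(std::uint32_t beg, std::uint32_t end)
{
    --end; // index of the last covered position
    if (beg >> 14 == end >> 14) return 4681 + (beg >> 14); // 16 kbp bins
    if (beg >> 17 == end >> 17) return 585 + (beg >> 17);  // 128 kbp bins
    if (beg >> 20 == end >> 20) return 73 + (beg >> 20);   // 1 Mbp bins
    if (beg >> 23 == end >> 23) return 9 + (beg >> 23);    // 8 Mbp bins
    if (beg >> 26 == end >> 26) return 1 + (beg >> 26);    // 64 Mbp bins
    return 0;                                              // whole 512 Mbp range
}

// A short interval lands in a leaf bin; one crossing a 16 kbp boundary moves up a level.
static_assert(region_to_bin(0, 100) == 4681);
static_assert(region_to_bin(16000, 17000) == 585);
```

A query then only has to inspect the chunks of the bins that can overlap the requested region, which is what makes region lookups cheaper than scanning the whole file.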
Some preliminary "benchmarks":
Due to architectural problems, I don't think we can ever get the IOPS to be as low as with htslib. Please see my comments in the PR. In practice, the results still seem to be OK, but this is just one example, in which I queried a region very far toward the end of a 300 MB compressed VCF. We definitely need to do more testing.
TODO