samtools / htslib

C library for high-throughput sequencing data formats

bgzf index - last offset? #262

Open jvolkening opened 9 years ago

jvolkening commented 9 years ago

I noticed that the index written during compression by bgzip (-i) is missing the final block offset, while it is present if the index is rewritten (-r). The source for bgzf.c specifically notes that this is NOT a bug, but does not explain further. Can anyone fill me in on the reason for this? I'm polishing up an implementation of BGZF I/O in Perl and would like to understand this feature (or quirk) for index I/O compatibility.

jrandall commented 9 years ago

@pd3 can you explain why this is not a bug?

pd3 commented 9 years ago

When writing, bgzf_flush is called from bgzf_close, so the last block is not known until bgzf_close is called. I think it does not matter in practice.

jvolkening commented 9 years ago

Thanks - if it is an implementation detail that's fine. My own code requires a complete index but I can fill in the last offset as needed.
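
For anyone else who needs a complete index, one way to fill in missing offsets is to walk the BGZF file itself and record where each block starts. The following is a minimal Python sketch, not htslib's code: `scan_block_offsets` and `make_bgzf_block` are hypothetical names chosen for illustration, and the sketch relies only on the BGZF block layout from the SAM specification (a gzip member whose `BC` extra subfield holds the total block size minus 1, with the uncompressed size in the last four bytes). Exactly which record htslib treats as the "final" one is not spelled out in this thread, so the sketch simply reports every block start, including the empty EOF block.

```python
import struct
import zlib


def scan_block_offsets(buf: bytes):
    """Return (compressed_offset, uncompressed_offset) for the start of each BGZF block."""
    offsets = []
    coff = 0  # position in the compressed file
    uoff = 0  # position in the uncompressed stream
    while coff < len(buf):
        id1, id2, cm, flg = struct.unpack_from("<BBBB", buf, coff)
        if (id1, id2, cm) != (31, 139, 8) or not flg & 4:
            raise ValueError(f"not a BGZF block at offset {coff}")
        (xlen,) = struct.unpack_from("<H", buf, coff + 10)
        pos = coff + 12
        end = pos + xlen
        bsize = None
        # Walk the gzip extra subfields looking for 'B','C' with length 2.
        while pos + 4 <= end:
            si1, si2, slen = struct.unpack_from("<BBH", buf, pos)
            if (si1, si2, slen) == (66, 67, 2):
                (bsize,) = struct.unpack_from("<H", buf, pos + 4)
            pos += 4 + slen
        if bsize is None:
            raise ValueError(f"no BC subfield at offset {coff}")
        block_len = bsize + 1  # BSIZE is the total block size minus 1
        # ISIZE (uncompressed length) is the last 4 bytes of the block.
        (isize,) = struct.unpack_from("<I", buf, coff + block_len - 4)
        offsets.append((coff, uoff))
        coff += block_len
        uoff += isize
    return offsets


def make_bgzf_block(data: bytes) -> bytes:
    """Hypothetical helper: wrap data in a single BGZF block, for demonstration only."""
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)  # raw deflate stream
    cdata = comp.compress(data) + comp.flush()
    bsize = 18 + len(cdata) + 8 - 1  # header + payload + trailer, minus 1
    header = struct.pack(
        "<BBBBIBBHBBHH", 31, 139, 8, 4, 0, 0, 255, 6, 66, 67, 2, bsize
    )
    trailer = struct.pack("<II", zlib.crc32(data) & 0xFFFFFFFF, len(data))
    return header + cdata + trailer
```

Comparing the result with the offsets read from an index written by bgzip -i should show which record is absent. For real files the scan would read block headers from disk rather than hold the whole file in memory.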