zendesk / ultragrep

the grep that greps the hardest.
Apache License 2.0

Build gzipped indexes #16

Closed osheroff closed 11 years ago

osheroff commented 11 years ago

@vanchi-zendesk I'm not sure how much time you want to put into a CR of this, but here's what I'm doing:

For every log that we index, I'm storing an index file that holds a timestamp -> uncompressed-offset map. For gzipped logs, I'm also keeping a second index file that holds, for each entry, the uncompressed_offset, the compressed_offset, and the 32k of dictionary data required to seed decompression at that point.

Note that this isn't quite complete; I still need to write the Ruby script that goes out and indexes logs that have yet to be indexed.
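
A minimal Python sketch of the two-index idea described above, to show why the 32k of dictionary data is stored with each entry. This is not the code in this pull request: `build_index`, `read_from`, and `offset_for_time` are hypothetical names, and the sketch assumes a raw deflate stream with access points forced onto byte boundaries via `Z_SYNC_FLUSH`. Indexing an existing gzip file would need bit-level offsets instead.

```python
import bisect
import zlib

WINDOW = 32 * 1024  # deflate back-references reach at most 32k behind


def build_index(chunks):
    """Compress chunks into one raw deflate stream.

    Returns (compressed, points). Each point is a tuple of
    (uncompressed_offset, compressed_offset, dictionary), recorded
    before every chunk except the first.
    """
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)  # -15 selects raw deflate
    out = bytearray()
    tail = b""      # last 32k of uncompressed data seen so far
    total = 0       # uncompressed bytes seen so far
    points = []
    for i, chunk in enumerate(chunks):
        if i:
            # Assumption: a sync flush byte-aligns the stream, so the next
            # block can be located with a plain byte offset.
            out += comp.flush(zlib.Z_SYNC_FLUSH)
            points.append((total, len(out), tail))
        out += comp.compress(chunk)
        tail = (tail + chunk)[-WINDOW:]
        total += len(chunk)
    out += comp.flush(zlib.Z_FINISH)
    return bytes(out), points


def read_from(compressed, points, target):
    """Return the uncompressed bytes from offset `target` to the end."""
    offsets = [p[0] for p in points]
    i = bisect.bisect_right(offsets, target) - 1
    if i < 0:
        # Target lies before the first access point: start from the top.
        start_u, start_c = 0, 0
        decomp = zlib.decompressobj(-15)
    else:
        start_u, start_c, window = points[i]
        # Seed the decompressor with the stored 32k so that back-references
        # into earlier data still resolve.
        decomp = zlib.decompressobj(-15, window)
    data = decomp.decompress(compressed[start_c:])
    return data[target - start_u:]


def offset_for_time(ts_index, ts):
    """Map a timestamp to an uncompressed offset.

    ts_index is a sorted list of (timestamp, uncompressed_offset) pairs.
    Returns the offset of the last entry at or before ts, or 0 if none.
    """
    stamps = [t for t, _ in ts_index]
    i = bisect.bisect_right(stamps, ts) - 1
    return ts_index[i][1] if i >= 0 else 0
```

A lookup would chain the two: `offset_for_time` turns a timestamp into an uncompressed offset, and `read_from` uses the nearest access point at or before that offset to start decompressing without reading the file from the beginning.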

vanchi-zendesk commented 11 years ago

+1

vanchi-zendesk commented 11 years ago

+1 again