Tweets are huge json blobs with lots of repetition that compress quite nicely with gzip. There's already a command-line argument for enabling gzip, so compression is included as a boolean in the options dictionary.
If compression is enabled, we should add ".gz" to the end of every tweet filename and gzip the json blob before saving it to disk. Similarly, when reading tweets we should expect a ".gz" at the end of the filename and decompress the file before loading the json blob.
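The save and load behaviour described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names (`tweet_path`, `save_tweet`, `load_tweet`), the `"<id>.json"` naming scheme, and the `"gzip"` key in the options dictionary are hypothetical and may not match the real code.

```python
import gzip
import json
import os


def tweet_path(directory, tweet_id, options):
    # Hypothetical naming scheme: "<id>.json", plus ".gz" when compression is on.
    name = f"{tweet_id}.json"
    if options.get("gzip"):  # assumed key name for the compression flag
        name += ".gz"
    return os.path.join(directory, name)


def save_tweet(directory, tweet, options):
    path = tweet_path(directory, tweet["id"], options)
    # gzip.open in text mode compresses on write; plain open writes raw json.
    opener = gzip.open if options.get("gzip") else open
    with opener(path, "wt", encoding="utf-8") as f:
        json.dump(tweet, f)
    return path


def load_tweet(directory, tweet_id, options):
    path = tweet_path(directory, tweet_id, options)
    # Mirror of save_tweet: decompress on read only when compression is enabled.
    opener = gzip.open if options.get("gzip") else open
    with opener(path, "rt", encoding="utf-8") as f:
        return json.load(f)
```

Routing both functions through one path helper keeps the ".gz" suffix and the choice of opener tied to the same flag, so a file written with compression enabled is always looked up under its compressed name.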