Data4Democracy / discursive

Twitter topic search and indexing with Elasticsearch

Write valid JSON to S3 #7

Closed by hadoopjax 7 years ago

hadoopjax commented 7 years ago

To support analysts, we need to ensure that the StreamListener class in index_twitter_stream.py writes valid JSON to S3.
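One way to check the "valid JSON" requirement is to serialize each tweet as one JSON document per line and confirm that every line parses back. This is a minimal sketch, not the project's code: the helper names to_json_lines and is_valid_json_lines are hypothetical, and it assumes each tweet is already available as a Python dict.

```python
import json


def to_json_lines(tweets):
    """Serialize an iterable of tweet dicts as newline-delimited JSON.

    Hypothetical helper. json.dumps escapes embedded newlines and, with the
    default ensure_ascii=True, all non-ASCII characters, so each tweet is
    guaranteed to occupy exactly one line.
    """
    return "\n".join(json.dumps(tweet) for tweet in tweets)


def is_valid_json_lines(payload):
    """Return True if every non-empty line of payload parses as JSON."""
    try:
        for line in payload.splitlines():
            if line.strip():
                json.loads(line)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True
```

Running is_valid_json_lines on the payload before uploading would catch a common failure mode, such as writing the Python repr of a dict (single quotes) instead of JSON.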

Each write to S3 should create a key (filename) that includes a timestamp, since this stream runs every 15 minutes and each run needs its own file. Bonus points for a way to zip/concatenate files, taking filenames as arguments (so analysts could, for instance, pull all Tweets from a given day, week, etc.).
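The timestamped key and the concatenation idea could be sketched roughly as follows. Everything here is an assumption for illustration: the function names, the "tweets" prefix, and the key format are hypothetical, and the client argument is expected to be something like boto3.client("s3"), whose put_object call accepts Bucket, Key, and Body.

```python
from datetime import datetime, timezone


def make_key(prefix="tweets", now=None):
    """Build a key such as tweets/2017-01-15T10-30-00Z.json.

    `now` is assumed to be a UTC datetime; it defaults to the current time.
    """
    now = now or datetime.now(timezone.utc)
    return f"{prefix}/{now.strftime('%Y-%m-%dT%H-%M-%SZ')}.json"


def write_batch(client, bucket, payload, now=None):
    """Upload one batch of JSON text under a timestamped key and return the key."""
    key = make_key(now=now)
    client.put_object(Bucket=bucket, Key=key, Body=payload.encode("utf-8"))
    return key


def keys_for_day(keys, day, prefix="tweets"):
    """Select the keys written on a given day, where day is 'YYYY-MM-DD'."""
    return sorted(k for k in keys if k.startswith(f"{prefix}/{day}T"))


def concatenate(bodies):
    """Join several newline-delimited JSON bodies into one, skipping empty ones."""
    return "\n".join(body.strip("\n") for body in bodies if body.strip())
```

Because the timestamp sorts lexicographically, an analyst could list the keys under the prefix, filter them with keys_for_day, download the matching objects, and pass their contents to concatenate to get a single file for that day. Extending the filter to a week or month would only change the string being matched.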

hadoopjax commented 7 years ago

Merged w/ master and we can now write to Elasticsearch and S3!