EnCiv / undebate

Not debates, but recorded online video Q&A with candidates so voters can quickly get to know them, for every candidate, for every election, across the US.

Build data pipeline with ELK stack for visualizing logs #253

Open djbowers opened 4 years ago

djbowers commented 4 years ago

Undebate ELK Data Pipeline

We want to be able to visualize our log data so that we can make better decisions about where to focus our time and effort to most effectively improve this application.

So far @epg323 has made some progress towards this goal with a Node app that reads the log data directly from MongoDB and outputs some results to the command line, which led to discovering some changes that needed to be made in the application itself.

This ticket is an attempt to create a deployable pipeline using the open-source ELK stack (Elasticsearch, Logstash, and Kibana) that will accomplish the same goal. We will use Logstash to pipe the data from Mongo to Elasticsearch, then visualize the data in Elasticsearch using Kibana.

Our hope is to be able to deploy this pipeline with Docker on AWS. When we tear down the pipeline, we will store the historical log data in S3. Every time we deploy the pipeline, we will first read in historical data from S3, then read in the new data from Mongo. This way we have cheap storage for our historical data and don't have to worry about our short log history on Mongo.
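To make the load order concrete, here is a rough Python sketch of the replay step: archived documents first, then whatever Mongo still holds that is not already in the archive. It is only an illustration under assumptions. `replay_logs` is a hypothetical helper, the sample fields are made up, and it assumes every log document keeps its Mongo `_id` when it is archived to S3, so the overlap between the two sources can be detected by id.

```python
from typing import Iterable, Iterator


def replay_logs(historical: Iterable[dict], fresh: Iterable[dict]) -> Iterator[dict]:
    """Yield archived log documents first, then new ones not already archived.

    Assumes each document carries a unique "_id" (hypothetical: the Mongo _id
    is preserved when documents are written to the S3 archive).
    """
    seen_ids = set()
    for doc in historical:
        seen_ids.add(doc["_id"])
        yield doc
    for doc in fresh:
        # Mongo may still hold documents that were archived at the last teardown,
        # so skip anything whose id has already been replayed.
        if doc["_id"] not in seen_ids:
            seen_ids.add(doc["_id"])
            yield doc


# Made-up sample data: "a2" exists both in the archive and in Mongo.
archived = [{"_id": "a1", "level": "info"}, {"_id": "a2", "level": "error"}]
from_mongo = [{"_id": "a2", "level": "error"}, {"_id": "a3", "level": "warn"}]

for doc in replay_logs(archived, from_mongo):
    print(doc["_id"], doc["level"])
```

Keeping the set of seen ids in memory is fine for a short log history; a larger archive would probably call for filtering on a timestamp instead.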

Tasks Remaining

ddfridley commented 4 years ago

Goal for next week is to be able to read in the documents from Mongo and make some visualization.

djbowers commented 4 years ago

In case Logstash doesn't work out for moving the logs from MongoDB to Elasticsearch, I found this Stack Overflow question on doing it with Python: https://stackoverflow.com/questions/44155858/load-data-from-mongodb-to-elasticsearch-through-python
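For reference, a minimal sketch of what the Python route might involve, independent of the linked question. It only covers the transformation step: turning Mongo-style documents into the newline-delimited body that Elasticsearch's `_bulk` endpoint accepts. The helper names, the `undebate-logs` index name, and the sample fields are all assumptions for illustration; reading from Mongo and sending the request are left out.

```python
import json
from datetime import datetime
from typing import Iterable, Iterator


def to_bulk_lines(docs: Iterable[dict], index: str) -> Iterator[str]:
    """Yield the action line and source line for each document (bulk NDJSON)."""
    for doc in docs:
        source = dict(doc)  # copy so the caller's document is not modified
        # Elasticsearch treats _id as metadata, so it goes in the action line
        # and must not appear inside the document body itself.
        doc_id = str(source.pop("_id"))
        yield json.dumps({"index": {"_index": index, "_id": doc_id}})
        # default=str covers values json cannot encode natively, such as
        # datetimes (and ObjectId values if the documents come from pymongo).
        yield json.dumps(source, default=str)


def to_bulk_body(docs: Iterable[dict], index: str) -> str:
    """Join the lines into one request body; every line ends with a newline."""
    return "".join(line + "\n" for line in to_bulk_lines(docs, index))


# Made-up sample documents standing in for the result of a Mongo query.
sample = [
    {"_id": "abc123", "level": "info", "startTime": datetime(2020, 5, 1, 12, 0)},
    {"_id": "def456", "level": "error", "startTime": datetime(2020, 5, 1, 12, 5)},
]
print(to_bulk_body(sample, "undebate-logs"), end="")
```

The resulting string could be sent to the `_bulk` endpoint with the `Content-Type: application/x-ndjson` header, with the documents coming from a pymongo `find()` cursor instead of the sample list.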

ddfridley commented 4 years ago

Goal for next week is to be able to read in all of the keywords.