drem-darios / worldstream

This is the World Stream!
MIT License

As a WS designer, I need to have a technology stack #8

Closed drem-darios closed 9 years ago

drem-darios commented 9 years ago

Most likely we will go with Kafka for our data bus, but for data analysis and ETL we probably want a separate stream processing framework. I've heard Storm plays well with Kafka, but there is also Spark Streaming. Here is a comparison of the two; I will decide which one fits us better soon. http://xinhstechblog.blogspot.com/2014/06/storm-vs-spark-streaming-side-by-side.html

drem-darios commented 9 years ago

Ok, I couldn't wait to read the whole thing. Two things stood out to me that make me lean towards Storm (at least for the initial phase).

  1. First of all, Spark Streaming does not offer sub-second latency; it may take a few seconds to process data. This may not matter once we have many users, because by then we may be taking in data, processing it, and playing it back a few seconds later rather than the instant they press "Send".
  2. The second thing is that Storm processes records one at a time. This is how we want our data handled right now. Again, later on, when there are more users and it becomes unrealistic to process data immediately (if that actually happens), Spark Streaming may be a better option because it micro-batches the data.
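
To make the difference between the two points above concrete, here is a rough Python sketch of one-at-a-time handling versus micro-batching. It is only an illustration under assumptions: it uses neither the Storm nor the Spark Streaming API, the `(arrival_time, payload)` event shape and the names `per_record`, `micro_batch`, and `handle` are made up for this example, and processing time itself is ignored so that only the batching delay shows up.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical event shape: (arrival time in seconds, payload).
Event = Tuple[float, str]


def per_record(events: Iterable[Event],
               handle: Callable[[str], str]) -> Iterator[Event]:
    """One-at-a-time style: each event is handled the moment it arrives."""
    for arrived_at, payload in events:
        # Output time equals arrival time (processing cost is ignored here).
        yield arrived_at, handle(payload)


def micro_batch(events: Iterable[Event],
                handle: Callable[[str], str],
                interval: float) -> Iterator[Event]:
    """Micro-batch style: events wait until the end of their time window."""
    batch: List[str] = []
    window_end = interval
    for arrived_at, payload in events:
        # Flush every window that closed before this event arrived.
        while arrived_at >= window_end:
            for buffered in batch:
                yield window_end, handle(buffered)
            batch = []
            window_end += interval
        batch.append(payload)
    # Flush whatever is left when the input ends.
    for buffered in batch:
        yield window_end, handle(buffered)


if __name__ == "__main__":
    sample = [(0.1, "a"), (0.4, "b"), (1.2, "c")]
    print(list(per_record(sample, str.upper)))
    print(list(micro_batch(sample, str.upper, interval=1.0)))
```

With the sample input, the one-at-a-time version emits each result at its arrival time (0.1, 0.4, 1.2), while the micro-batch version with a one-second interval emits "A" and "B" at 1.0 and "C" at 2.0. That waiting time is the extra latency described in point 1.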

In short, Storm will be our stream processing framework.

DaJebu commented 9 years ago

Awesome, sounds good to me.