Analyze Yahoo streaming benchmarks (from their blog probably)
Goal: Measure the latency of the frameworks under low throughput scenarios
Solution:
Flink and Storm: low latency (good) and good throughput
Spark: high latency (bad) and acceptable throughput
Flink did not use checkpointing to guarantee processing
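A minimal sketch of how latency could be summarized in a low-throughput run. The event format (pairs of emit time and processing time in milliseconds) and the function names are assumptions for illustration, not the benchmark's own code.

```python
import math

def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list, with p in (0, 100]."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]

def latency_summary(events):
    """events: iterable of (emitted_ms, processed_ms) pairs (hypothetical format)."""
    latencies = sorted(done - emitted for emitted, done in events)
    return {
        "p50": percentile(latencies, 50),
        "p99": percentile(latencies, 99),
        "max": latencies[-1],
    }

# Made-up sample data, only to exercise the functions.
sample = [(0, 12), (10, 25), (20, 31), (30, 95), (40, 52)]
summary = latency_summary(sample)
```

Looking at the tail (p99, max) and not only the median matters here, because a single slow result can hide behind a good average.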
Goal: measure the maximum throughput of each system while maintaining the best possible fault tolerance.
Goal 2: Try optimizations and variants, such as not using the Redis key-value store
What are the differences?
With the new approach:
Flink is more efficient because it can sustain a higher throughput
Fault tolerance and consistency are kept in both Flink and Storm
What are the bottlenecks in yahoo streaming benchmarks?
Redis
Writing to the key-value store while the windows are updated very quickly; it crashes at 280,000 events/sec
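A rough sketch of why the key-value store becomes the bottleneck: writing on every event costs one store operation per event, while keeping counts locally and writing once per window costs one operation per window. The `CountingStore` class is an in-memory stand-in (not Redis), and the 10-second window, the campaign keys, and the event data are assumptions for illustration.

```python
from collections import defaultdict

WINDOW_MS = 10_000  # assumed window size, for illustration only

class CountingStore:
    """In-memory stand-in for an external key-value store; counts writes."""
    def __init__(self):
        self.data = {}
        self.writes = 0

    def set(self, key, value):
        self.data[key] = value
        self.writes += 1

def write_through(events, store):
    """Update the store on every event (the bottleneck pattern)."""
    counts = defaultdict(int)
    for campaign, ts in events:
        key = (campaign, ts // WINDOW_MS)
        counts[key] += 1
        store.set(key, counts[key])

def write_on_close(events, store):
    """Keep counts locally and write each window once at the end."""
    counts = defaultdict(int)
    for campaign, ts in events:
        counts[(campaign, ts // WINDOW_MS)] += 1
    for key, value in counts.items():
        store.set(key, value)

# 3000 synthetic events over 3 campaigns and 3 windows.
events = [("c%d" % (i % 3), i * 10) for i in range(3000)]
per_event, per_window = CountingStore(), CountingStore()
write_through(events, per_event)
write_on_close(events, per_window)
```

Both stores end up with the same counts, but `per_event.writes` grows with the number of events while `per_window.writes` grows only with the number of windows.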
Based on the bottlenecks, state a clear definition of the problem.
Remove the key-value store, because the window state is already part of the fault-tolerant local state (with the checkpoints??),
With this approach, throughput goes from 280,000 events/sec to 15,000,000 events/sec
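A toy sketch of the idea of fault-tolerant local state: counts live inside the operator, a checkpoint copies them together with the input position, and after a failure the operator restores the copy and replays the input from that position. `WindowCounter` and its methods are hypothetical and are not Flink's API; the last line only computes the ratio of the two throughput figures noted above.

```python
import copy

class WindowCounter:
    """Toy operator holding counts as local state (not Flink's API)."""
    def __init__(self):
        self.counts = {}
        self.offset = 0  # position in the input that the state reflects

    def process(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        self.offset += 1

    def snapshot(self):
        # Copy state and input position together so they stay consistent.
        return copy.deepcopy((self.counts, self.offset))

    def restore(self, snap):
        self.counts, self.offset = copy.deepcopy(snap)

stream = ["a", "b", "a", "c", "a", "b"]

op = WindowCounter()
for key in stream[:4]:
    op.process(key)
checkpoint = op.snapshot()       # state after 4 events

op.process(stream[4])            # progress that is lost in the simulated crash
op = WindowCounter()             # simulate failure: fresh operator
op.restore(checkpoint)
for key in stream[op.offset:]:   # replay input from the checkpointed offset
    op.process(key)

speedup = 15_000_000 / 280_000   # ratio of the two throughput figures in the notes
```

After recovery each event is counted exactly once, with no external store on the per-event path; the ratio works out to roughly 54x.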