yahoo / streaming-benchmarks

Benchmarks for Low Latency (Streaming) solutions including Apache Storm, Apache Spark, Apache Flink, ...
Apache License 2.0

Seen.txt and updated.txt are empty #20

Open Nachiket90 opened 7 years ago

Nachiket90 commented 7 years ago

I am trying to set up the Yahoo streaming benchmark for one of my assignments. I was able to run the benchmark suite and see results on the console. As mentioned in the project README, I was expecting results in seen.txt and updated.txt in the data dir, but in my case those files are always empty. I might have made a mistake in the setup; can you help me resolve it and get results into updated.txt/seen.txt?

I even tried to run https://github.com/dataArtisans/yahoo-streaming-benchmark, but there as well the files were empty after execution.
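A quick way to check whether a run produced anything in those files (a minimal sketch; it assumes seen.txt and updated.txt land under data/ in the repo root, as the README describes):

```bash
# Minimal sketch: verify the result files exist and are non-empty after a run.
# The data/ paths are assumptions based on where the README says the files go.
for f in data/seen.txt data/updated.txt; do
  if [ -s "$f" ]; then
    echo "$f: $(wc -l < "$f") lines"
  else
    echo "$f is missing or empty"
  fi
done
```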

revans2 commented 7 years ago

This seems to indicate that the data did not show up in Redis as expected, or that the tool could not find Redis to get the data from. How are you trying to run the benchmark?
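A quick way to check whether anything made it into Redis at all (a minimal sketch using stock redis-cli; localhost:6379 is an assumption, point it at whatever Redis instance the benchmark uses):

```bash
# Minimal sketch: confirm Redis is reachable and see whether any keys were written.
# Host and port are assumptions; adjust to your setup.
redis-cli -h localhost -p 6379 ping                 # should print PONG
redis-cli -h localhost -p 6379 dbsize               # key count; 0 means nothing was written
redis-cli -h localhost -p 6379 --scan | head -n 20  # peek at a few key names
```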

Nachiket90 commented 7 years ago

I downloaded the zip version of the benchmark from GitHub and copied it to a CentOS server. I am running the benchmark as: ./stream-bench.sh SPARK_TEST

revans2 commented 7 years ago

That is odd. I'll try to reproduce it and see what I can come up with.

Nachiket90 commented 7 years ago

Could you reproduce it?

revans2 commented 7 years ago

I was able to reproduce it, but only after making a bunch of changes to the script to have it download the correct things (the Spark and Flink packages the script points to are no longer available for download). Also, it seems to only be happening for Spark. From what I can tell, Spark is not writing anything into Redis at all, so the files are actually accurate. I will have to do some more digging to see what might be happening.
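The kind of change involved is pointing the pinned versions / download URLs in stream-bench.sh at releases that are still available; a quick way to locate them (a sketch only, the grep pattern is just a guess at what the script contains):

```bash
# Find the pinned versions and download URLs in the run script so they can be
# pointed at releases that are still hosted on the Apache mirrors.
grep -nE 'VERSION|apache\.org|download' stream-bench.sh | head -n 40
```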

revans2 commented 7 years ago

OK, I saw the issue with Flink too, but Storm seems OK. This is really odd. Because we had to pull newer versions of both Spark and Flink to get releases that are still available for download, there might be something there. More likely it is something with Scala 2.11, which I also had to upgrade to, but I will try to look at both.
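One way to spot a mixed Scala binary version after such an upgrade (a sketch; it only greps the build files and assumes nothing about this repo beyond a Maven/Leiningen-style layout):

```bash
# Look for mixed Scala binary versions (e.g. 2.10 vs 2.11) across build files,
# a common cause of jobs silently misbehaving after an upgrade.
grep -rnE '2\.10|2\.11' --include='pom.xml' --include='*.sbt' --include='project.clj' .
```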

Nachiket90 commented 7 years ago

Thanks for the updates. Please share whatever solution you come up with for this issue.

Nachiket90 commented 7 years ago

Were you able to identify the root cause and a solution? My team is planning to run benchmarks against Spark, so I need a fix for this issue.

revans2 commented 7 years ago

I have not been able to identify it yet, but I honestly have not tried that hard and have a lot of other priorities right now. I hopefully will find some time to dig in tomorrow.

DarkRiderW commented 4 years ago

Any ideas? I've got the same issue.

mmoanis commented 4 years ago

I have the same issue, but only when running the Flink tests!

mmoanis commented 4 years ago

For Flink, it looks like the issue happens because of the requested operator parallelism. With the parallelism set to the default (1), it works. (Screenshot from 2020-03-14 11-42-17 attached.)
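For anyone who wants the same workaround without editing the job code, the parallelism can also be forced at submission time or in the Flink configuration (a sketch; the jar path is a placeholder, not the benchmark's actual artifact name):

```bash
# Sketch: force a job-wide parallelism of 1 when submitting the Flink job.
# The jar path below is a placeholder for the benchmark's Flink job jar.
flink run -p 1 path/to/benchmark-flink-job.jar

# Alternatively, set the cluster-wide default in conf/flink-conf.yaml:
#   parallelism.default: 1
```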