Open jswetzen opened 9 years ago
I haven't been able to solve this, and it's extremely hard to debug (you have to run it for up to seven hours to see if it crashes). Unfortunately, the result is that I'm giving up on Pyleus for the master's thesis I'm writing right now. I have lost too much time, and using Storm with Trident works without crashing. I hope Pyleus will continue to mature and provide a fully functioning alternative for Python enthusiasts in the future.
Sorry @jswetzen, I missed this one in the flow of my emails. :( And I'm sorry you're going to abandon Pyleus due to this bug. :(
Have you tried using the json serializer instead of the messagepack one? We wrote the messagepack one ourselves, so we may have introduced exotic bugs that the Storm community didn't experience.
Thanks for your reply @poros. Changing the serializer is one thing I have not tried. I can run a final test and see if that makes a difference.
It worked! It ran through all my 31 million lines without crashing!
I wonder how much speed I lose, since messagepack is supposed to be a lot faster. But this probably means that there's a bug somewhere in the messagepack code!
Happy to hear that the json serializer sorted things out!
A bug which is triggered after hours of running is a pain to tackle. In case you have any other/new useful info, do not hesitate to update this issue, please.
I faced the same problem, but I don't know how to set the json serializer. Could you give me some help? Thank you!
There's a problem that I've been unable to solve for a very long time now, so I need to ask about it. I have Kafka set up with a script reading lines from a csv file into it. Then I have a Pyleus topology with the kafka spout, a line reader bolt that splits the lines by comma and prints them to a log file and finally a "black hole" endpoint bolt that just accepts all the tuples and emits nothing. It's the simplest topology I could come up with for testing.
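For reference, the line reader step described above can be sketched in plain Python, independent of Pyleus. This is only an illustration under assumptions: `split_line`, `handle_line` and the logger name are hypothetical, and it uses the standard `csv` module rather than a bare split on commas so that quoted fields containing commas stay intact. In a real bolt, something like `handle_line` would be called from the bolt's tuple handler.

```python
import csv
import logging

# Hypothetical logger name; the original bolt writes to a log file.
logger = logging.getLogger("line_reader")


def split_line(line):
    """Split one CSV line into a list of string fields."""
    # csv.reader copes with quoted fields that contain commas,
    # which a plain line.split(",") would break apart.
    return next(csv.reader([line]), [])


def handle_line(line):
    """Log the parsed fields and return them as a tuple ready to emit."""
    fields = split_line(line.rstrip("\r\n"))
    logger.info("fields=%r", fields)
    return tuple(fields)
```

Keeping the parsing in a plain function like this makes it possible to run the whole CSV file through it outside Storm, which may help rule out a bad input line before blaming the serializer.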
My problem is this: it runs fine for a few hours (from two to seven) but then it suddenly crashes. I have rate limited the Kafka input so it's constant at about 700 tuples/sec, and looking at the Kafka process in jconsole I can see that the input is matched by the output, byte by byte. Storm memory usage fluctuates between 200 and 400 MB but never goes above that. That being said, here is the traceback from the bolt that crashes. It's the line reader bolt right after the spout.
I'm running in local mode and Storm says there was a java.lang.RuntimeException: Acked a non-existing or already acked/failed id:
Has anyone else encountered anything similar? I'm running the latest pyleus version and Storm 0.9.2-incubating.