Yelp / pyleus

Pyleus is a Python framework for developing and launching Storm topologies.
Apache License 2.0

What performance can be expected with Pyleus? #127

Closed jswetzen closed 9 years ago

jswetzen commented 9 years ago

I've been using Pyleus on a server with two 2.6 GHz AMD Opteron processors and 4GB of RAM, and get about 1500 tuples/second throughput. I'm using Kafka for input and limited the input speed to that because if I go any higher the CPU gets saturated. I'm wondering if that's normal or sounds really slow. For regular Storm I understand that it would be pretty slow, but there's a lot of overhead from stdin/stdout communication with shell bolts, is that right?

Basically I'd like to know if Pyleus is quite a bit slower than Storm+Java or if I'm doing something wrong?

poros commented 9 years ago

I don't have any official data to share, since performance depends heavily on the topology and on the workflow you use to measure it, but there is no doubt that Pyleus is slow compared to pure Java Storm topologies.

However, the major pain point is not Pyleus itself, but the Storm multilang framework. On top of the performance loss due to Python being notably slower than Java and the stdin/stdout communication overhead, you need to add the time spent in an additional serialization/deserialization step, so that tuples can be read by any non-JVM language (which wouldn't be possible if tuples were left in the Kryo format that Java Storm uses internally).
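To make that extra step concrete, here is a stdlib-only sketch of the JSON framing each tuple goes through on its way into a Python shell bolt. The field names follow Storm's multilang protocol; the ids and values are purely illustrative:

```python
import json

# What the JVM worker writes to a shell bolt's stdin for ONE tuple:
# the tuple is deserialized from Kryo, re-encoded as JSON, and framed
# with an "end" sentinel line (all values here are illustrative).
raw = (
    '{"id": "-6955786537413359385", "comp": "kafka-spout",'
    ' "stream": "default", "task": 9, "tuple": ["snow white", 1]}\n'
    'end\n'
)

# The Python side must parse this for every single tuple it receives...
msg = json.loads(raw.split("\nend\n")[0])
assert msg["tuple"] == ["snow white", 1]

# ...and re-serialize every tuple it emits, so each hop through a
# shell component pays two extra (de)serialization passes.
reply = json.dumps({"command": "emit", "tuple": [msg["tuple"][0], 2]}) + "\nend\n"
print(reply)
```

That per-tuple parse/serialize cycle, on top of the pipe I/O, is where most of the multilang overhead comes from.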

We have already tried to mitigate the issue by adding a msgpack serializer, which considerably boosts performance compared to the default JSON-based one that Storm provides.

An easy tweak to improve performance is to set need_task_ids to False when emitting from bolts. This saves you a read/write operation for each tuple you emit (it's a tip in the docs: https://yelp.github.io/pyleus/storm/bolt.html#pyleus.storm.bolt.Bolt.emit). This only works if your topology doesn't care about task ids, of course.
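A rough, stdlib-only sketch of why that flag matters. `emit_tuple` below is a hypothetical stand-in for a multilang-style emit path, not the Pyleus API: when task ids are requested, every emit has to block on an extra read from the JVM.

```python
import io
import json

def emit_tuple(values, out_stream, in_stream, need_task_ids=True):
    """Hypothetical sketch of a multilang-style emit (not Pyleus code)."""
    msg = {"command": "emit", "tuple": values}
    if not need_task_ids:
        msg["need_task_ids"] = False
    out_stream.write(json.dumps(msg) + "\nend\n")
    if need_task_ids:
        # Extra blocking round trip: wait for the JVM to send back
        # the ids of the tasks that received the tuple.
        task_ids = json.loads(in_stream.readline())
        in_stream.readline()  # consume the trailing "end" sentinel
        return task_ids
    return None

# With need_task_ids=True, the emitter must read a reply per tuple
# (here the "JVM" side is simulated with an in-memory stream)...
print(emit_tuple(["word", 1], io.StringIO(), io.StringIO("[5]\nend\n")))

# ...with need_task_ids=False, the emit is fire-and-forget.
print(emit_tuple(["word", 1], io.StringIO(), io.StringIO(), need_task_ids=False))
```

In Pyleus itself this corresponds to passing need_task_ids=False to Bolt.emit, as described in the docs linked above.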

You can also try to tweak settings like ackers, max_spout_pending, message_timeout_secs and max_shellbot_pending in your topology definition yaml file (https://yelp.github.io/pyleus/yaml.html), to adjust component parallelism or to disable the guaranteed message processing mechanism.
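For reference, a hypothetical fragment of such a topology definition yaml; the option names come from the comment above and the linked docs page, while the placement and values are purely illustrative:

```yaml
# topology_definition.yaml -- illustrative values only
name: my_topology
workers: 2
ackers: 0                 # 0 disables guaranteed message processing
max_spout_pending: 1000   # cap on un-acked tuples in flight per spout
message_timeout_secs: 60  # how long before an un-acked tuple is replayed
max_shellbot_pending: 500 # cap on tuples queued inside a shell component
```

The right numbers depend entirely on your tuple sizes and processing times, so treat these as starting points for experimentation.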

In addition to that, any customization like user-defined JVM options may help you as well. For any optimization specifically related to Storm, the Storm documentation and mailing list are your friends.

Having said that, I am fairly sure that, thanks to the msgpack serializer, Pyleus is way faster than a vanilla Python topology developed with the Python API available in Storm, but it is still far from the performance you can obtain by writing your topology in Java.

jswetzen commented 9 years ago

Thanks for that detailed answer! I understand that it's not a Pyleus issue but it's good to get a confirmation that Java topologies can be quite a bit faster.

I already use need_task_ids=False for all my emits and I think ackers etc. are configured well. Unfortunately I need to use the JSON serializer because I had issues with msgpack, so I guess that explains a bit of it. Thanks again!

poros commented 9 years ago

You're welcome. I'm going to close the issue, since my comment seems to answer your question. Feel free to reopen it or to post an update in case you find out anything useful to share.