Beam slowness compared to flink-native

apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.

https://beam.apache.org/

Apache License 2.0


Beam slowness compared to flink-native #21569

Open damccorm opened 2 years ago

damccorm commented 2 years ago

I compared a very simple Beam pipeline with an equivalent Flink-native pipeline. Both pipelines read strings from one Kafka topic and write them to another topic. I ran each pipeline separately on a single task manager with a single slot and parallelism 1.

Flink native runs 5 times faster than Beam: 150,000 strings per second in Flink compared to 30,000 in Beam.
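To make the gap easier to reason about when profiling, the throughput figures above can be turned into an average cost per record. The following is only a sketch: the class and method names are made up for illustration, and it assumes the single slot is the bottleneck, so that throughput translates directly into time per record. Under that assumption the reported rates work out to roughly 6.7 microseconds per string in Flink native versus 33.3 in Beam.

```java
import java.util.Locale;

/** Converts the throughput figures reported in this issue into per-record cost. */
final class PerRecordCost {

    /** Average wall-clock time per record, in microseconds, at a given throughput. */
    static double microsPerRecord(double recordsPerSecond) {
        return 1_000_000.0 / recordsPerSecond;
    }

    public static void main(String[] args) {
        double flinkRate = 150_000; // strings per second, as reported above
        double beamRate = 30_000;   // strings per second, as reported above

        double flinkMicros = microsPerRecord(flinkRate);
        double beamMicros = microsPerRecord(beamRate);

        System.out.printf(Locale.ROOT, "Flink native: %.2f us per record%n", flinkMicros);
        System.out.printf(Locale.ROOT, "Beam:         %.2f us per record%n", beamMicros);
        // The difference is the budget a profile would need to account for.
        System.out.printf(Locale.ROOT, "Extra in Beam: %.2f us per record%n",
                beamMicros - flinkMicros);
        System.out.printf(Locale.ROOT, "Slowdown factor: %.1fx%n", flinkRate / beamRate);
    }
}
```

Framed this way, the question for a profile becomes where the extra time per record is being spent.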

When using Avro and schema registry, the difference is even more significant: Flink native runs 30-80 times faster than Beam.

Attached is the Java code of both string-to-string pipelines.

Imported from Jira BEAM-14438. Original Jira may contain additional context. Reported by: iafek.

kennknowles commented 2 years ago

It would be very cool to have these pipelines so we can improve their performance. Given those numbers, this is probably very low-hanging fruit.

ifat-afek commented 2 years ago

The pipelines are attached to the jira ticket: https://issues.apache.org/jira/browse/BEAM-14438 Thanks :-)

Abacn commented 1 year ago

For anyone who finds this issue relevant, adding the pipeline option --fasterCopy=true may help. Context: #13240
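For readers wondering what kind of cost such an option targets: my understanding (an assumption, not confirmed in this thread) is that the Flink runner by default makes a defensive deep copy of each element passed between chained operators by round-tripping it through its coder, and that fasterCopy skips that copy. The plain-Java sketch below does not use Beam at all. `cloneViaEncoding` and `passByReference` are hypothetical stand-ins that only illustrate the difference between copying by encode/decode and handing the same reference downstream, and the timing loop is a rough indication rather than a proper benchmark.

```java
import java.nio.charset.StandardCharsets;

/** Illustrates a per-element defensive copy versus passing the same reference. */
final class CopyBetweenOperators {

    /** Deep copy by encoding the element to bytes and decoding a fresh object. */
    static String cloneViaEncoding(String element) {
        byte[] encoded = element.getBytes(StandardCharsets.UTF_8);
        return new String(encoded, StandardCharsets.UTF_8);
    }

    /** What skipping the copy amounts to: the same object goes downstream. */
    static String passByReference(String element) {
        return element;
    }

    public static void main(String[] args) {
        String element = "some kafka record value";
        int iterations = 5_000_000;
        long sink = 0; // consumed below so the loops are not optimized away

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += cloneViaEncoding(element).length();
        }
        long copyNanos = System.nanoTime() - start;

        start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += passByReference(element).length();
        }
        long referenceNanos = System.nanoTime() - start;

        System.out.println("copy via encoding: " + (copyNanos / iterations) + " ns/element");
        System.out.println("pass by reference: " + (referenceNanos / iterations) + " ns/element");
        System.out.println("checksum: " + sink);
    }
}
```

For short strings the copy is cheap in absolute terms, so a modest gain from skipping it would not be surprising for a string-to-string pipeline.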

ifat-afek commented 1 year ago

Thanks! I already tried fasterCopy and the performance improvement was not significant.

kennknowles commented 1 year ago

Have you looked at a profile to see where the overhead is?

kennknowles commented 1 year ago

I know that Flink has a highly tuned Kafka connector. And we do expect some overhead, since Beam will add a couple of layers of virtual method calls. If there are inefficiencies in serialization, that will be a serious overhead.

The other thing I should say is that in streaming we care primarily about cost and maximum scale, so limiting everything to 1 task manager / slot / thread / etc. is not necessarily measuring what you care about. It is still useful for finding overhead, for sure.

ifat-afek commented 1 year ago

We had severe performance issues in our pipelines, so we decided to test and compare very simple pipelines in order to find the root cause of the problems. Of course, the plan is to tune the scaling later on. How can I profile the pipeline and understand where the overhead is?
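One general way to approach this question is stack sampling: periodically capture the stack of the busy thread and count which frames show up most often. In practice this would more likely be done by attaching an existing JVM profiler (for example Java Flight Recorder or async-profiler) to the task manager process. The sketch below only illustrates the idea with the standard library. `StackSampler` and the spinning worker thread are hypothetical stand-ins, not part of Beam or Flink.

```java
import java.util.HashMap;
import java.util.Map;

/** A minimal sampling profiler: counts the top stack frame of one thread over time. */
final class StackSampler {

    /** Samples the target thread's stack and counts how often each top frame is seen. */
    static Map<String, Integer> sample(Thread target, int samples, long intervalMillis) {
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            StackTraceElement[] stack = target.getStackTrace();
            if (stack.length > 0) { // empty if the thread has not started or has ended
                String frame = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(frame, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop early
                break;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Stand-in for a pipeline's processing thread: burns CPU until interrupted.
        Thread worker = new Thread(() -> {
            long x = 0;
            while (!Thread.currentThread().isInterrupted()) {
                x += Long.hashCode(x) + 1;
            }
        }, "worker");
        worker.setDaemon(true);
        worker.start();

        Map<String, Integer> hits = sample(worker, 200, 5);
        worker.interrupt();

        // Print the most frequently observed frames first.
        hits.entrySet().stream()
                .sorted((a, b) -> b.getValue() - a.getValue())
                .forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
    }
}
```

Frames that dominate the counts are where the thread spends most of its time. Comparing such a profile between the Beam and Flink-native pipelines would show whether the extra time sits in serialization, copying, or the connector.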