Open damccorm opened 2 years ago
It would be very cool to have these pipelines so we can improve their performance. Given those numbers, this is probably very low-hanging fruit.
The pipelines are attached to the jira ticket: https://issues.apache.org/jira/browse/BEAM-14438 Thanks :-)
For someone finding this issue relevant to them, adding a pipeline option --fasterCopy=true
may help. context: #13240
Thanks! I already tried fasterCopy and the performance improvement was not significant.
Have you looked at a profile to see where the overhead is?
I know that Flink has a highly tuned Kafka connector. And we do expect some overhead since Beam will add a couple layers of virtual method calls. If there are inefficiencies in serialization that will be a serious overhead.
The other thing I should say is of course in streaming we care primarily about cost and maximum scale so limiting everything to 1 task manager / slot / thread / etc is not necessarily measuring what you care about. It is still useful for finding overhead, for sure.
We had severe performance issues in our pipelines, so we decided to test and compare very simple pipelines in order to try and find the root cause of the problems. Of course, the plan to later on tune the scaling. How can I profile the pipeline and understand where the overhead is?
I tried to compare a very simple beam pipeline with an equivalent flink-native pipeline. Both pipelines read strings from one kafka topic and write them to another topic. I ran each pipeline separately on a single task manager with a single slot and parallelism 1.
Flink native runs 5 times faster than beam: 150,000 strings per second in flink comparing to 30,000 in beam.
When using Avro and schema registry the difference is even more significant - flink native runs 30-80 times faster than beam.
Attached is the java code of both string-to-string pipelines.
Imported from Jira BEAM-14438. Original Jira may contain additional context. Reported by: iafek.