apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0
7.78k stars 4.21k forks source link

Compatibility Issues Between Apache Beam and Flink #29660

Open vcastell-tibco opened 9 months ago

vcastell-tibco commented 9 months ago

I am running into some issues and would like comments on how to proceed or resolve the issue.

I have been successful in running the Flink examples against a Flink Cluster running on Kubernetes. I am trying to run similar examples from Apache Beam. I am able to the following using Apache Beam 2.5.0 or 2.5.2 against Flink 1.16.3 but I am trying to use Flink 1.17.1 or later but I cannot submit job against Flink cluster. Package Apache Beam Wordcount example with Flink runner using command “$ mvn package -Pflink-runner”? Push the application to Flink Cluster using below convention which is documented in Beam docs.

mvn exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \ -Pflink-runner \ -Dexec.args=“--runner=FlinkRunner \ --inputFile=/path/to/pom.xml \ --output=/path/to/counts \ --flinkMaster= \ --filesToStage=target/word-count-beam-bundled-0.1.jar”

If I do above steps using Beam 2.50.0 or 2.52.0 against Flink 1.16.3 then it works fine. Job is submitted and runs successfully. If I use the same version of Beam but a higher version of Flink then the job fails to submit. I am getting errors with the dependencies which are packaged with application. It looks like the additional jars packaged are not compatible with the Flink runtime. Can someone please let me know what combination of Beam is required with Flink 1.17.x or higher? Thanks

ConfuzedCoder commented 8 months ago

I think the Flink Runner only supports a Flink version till 1.16, there is no support for versions 1.17 and 1.18. Someone correct me if I am wrong.