apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0
7.9k stars 4.27k forks source link

[Bug]: Unable to use KafkaIO with Flink Runner #26217

Open aditiwari01 opened 1 year ago

aditiwari01 commented 1 year ago

What happened?

I am trying to use KafkaIO read with Flink Runner for Beam version 2.45.0 I am seeing the following issues with the same:

org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: No translator known for org.apache.beam.runners.core.construction.SplittableParDo$PrimitiveUnboundedRead
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222)
    at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114)
    at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:841)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:240)
    at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1085)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1163)
    at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1163)
Caused by: java.lang.IllegalStateException: No translator known for org.apache.beam.runners.core.construction.SplittableParDo$PrimitiveUnboundedRead
    at org.apache.beam.runners.core.construction.PTransformTranslation.urnForTransform(PTransformTranslation.java:283)
    at org.apache.beam.runners.flink.FlinkStreamingPipelineTranslator.visitPrimitiveTransform(FlinkStreamingPipelineTranslator.java:135)
    at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:593)
    at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:585)
    at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:585)
    at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:585)
    at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$500(TransformHierarchy.java:240)
    at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:214)
    at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:469)
    at org.apache.beam.runners.flink.FlinkPipelineTranslator.translate(FlinkPipelineTranslator.java:38)
    at org.apache.beam.runners.flink.FlinkStreamingPipelineTranslator.translate(FlinkStreamingPipelineTranslator.java:92)
    at org.apache.beam.runners.flink.FlinkPipelineExecutionEnvironment.translate(FlinkPipelineExecutionEnvironment.java:115)
    at org.apache.beam.runners.flink.FlinkRunner.run(FlinkRunner.java:104)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:323)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:309)
    at BeamPipelineKafka.main(BeamPipelineKafka.java:51)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355)
    ... 8 more

Is there anything I am missing? Or maybe some config to not use this SpilttableParDo?

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

aditiwari01 commented 1 year ago

Have also tried with experimental flag "--experiments=use_deprecated_read"

Still facing same error.

zjffdu commented 1 year ago

@aditiwari01 Could you share your sample code? I can use KafkaIO in flink runner.

prakhar-clarity commented 3 months ago

did it get solved? i am getting the same issue using pubsubIO I am using beam version 2.57.0