Open Abacn opened 7 months ago
I don't think this is necessarily a bug, but rather a limitation due to the uber jar not properly registering coders. I ran a job with more logging, and it seems we are simply unable to find a translator for the URN "beam:coder:length_prefix:v1" (and UnknownCoderWrapper ends up being the fallback).
We register LengthPrefixCoder here: https://github.com/apache/beam/blob/6bca71070e96b56b781600e8833a72cea329b1a1/sdks/java/core/src/main/java/org/apache/beam/sdk/util/construction/ModelCoderRegistrar.java#L48
So it seems this registration is not being performed for the uber jar.
I don't get the error when I stage the beam-sdks-java-core jar (which performs the above registration) along with the uber jar:
--filesToStage=./build/libs/unknown_coder_error-1.0-all.jar,beam-sdks-java-core-2.56.0-SNAPSHOT.jar
I'm not sure if this is necessarily a release blocker.
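One way to check whether an uber jar kept the registration is to look at its `META-INF/services` entry for the registrar interface. The following is a minimal sketch, not part of Beam: the class `ServiceFileCheck` is hypothetical, and the default service name is an assumption based on the package of `ModelCoderRegistrar` linked above (it assumes the registrar interface is `CoderTranslatorRegistrar` in the same package). If the printed list is empty or does not contain `ModelCoderRegistrar`, the service file was probably dropped or overwritten while building the jar.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class ServiceFileCheck {
  // Assumption: the service interface that ModelCoderRegistrar is registered under.
  // Adjust this if the interface name differs in your Beam version.
  static final String DEFAULT_SERVICE =
      "org.apache.beam.sdk.util.construction.CoderTranslatorRegistrar";

  /** Returns the provider class names listed in META-INF/services/<service> in the jar. */
  static List<String> providers(Path jar, String service) throws IOException {
    List<String> result = new ArrayList<>();
    try (JarFile jarFile = new JarFile(jar.toFile())) {
      JarEntry entry = jarFile.getJarEntry("META-INF/services/" + service);
      if (entry == null) {
        return result; // no service file: ServiceLoader finds no providers in this jar
      }
      try (BufferedReader reader = new BufferedReader(
          new InputStreamReader(jarFile.getInputStream(entry), StandardCharsets.UTF_8))) {
        String line;
        while ((line = reader.readLine()) != null) {
          int hash = line.indexOf('#'); // '#' starts a comment in service files
          if (hash >= 0) {
            line = line.substring(0, hash);
          }
          line = line.trim();
          if (!line.isEmpty()) {
            result.add(line);
          }
        }
      }
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    if (args.length < 1) {
      System.err.println("usage: ServiceFileCheck <jar> [service-interface]");
      return;
    }
    Path jar = Paths.get(args[0]);
    String service = args.length > 1 ? args[1] : DEFAULT_SERVICE;
    System.out.println(service + " -> " + providers(jar, service));
  }
}
```

With Java 11 or later this can be run directly as `java ServiceFileCheck.java ./build/libs/unknown_coder_error-1.0-all.jar`, and again against the beam-sdks-java-core jar for comparison.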
Presumably this used to work and doesn't now? Or do we need better instructions on creating an uberjar that correctly preserves all the registration information?
Please create a doc about how to build an uber jar. We have been getting a couple of customer issues related to this.
I see, so
Given that the underlying issue has always existed, and that the action item is more of a documentation request (the proper way to package an uber jar), I agree this isn't a release blocker. Adjusting the priority tag accordingly.
Long term, should we try to move away from AutoService for built-in components (at least if standard uber jar building tools do not do the right thing with them)? Is this more feasible now with the merging of runners core? @kennknowles
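As a rough illustration of that idea, and not Beam's actual code, built-in registrars could be wired up explicitly while `ServiceLoader` remains in use only for extensions. Everything in this sketch is hypothetical: the `UrnRegistrar` interface, the `BuiltInRegistrar` class, and the placeholder translator name are stand-ins. Only the URN is taken from the discussion above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ServiceLoader;

public class RegistrarLookup {
  /** Hypothetical stand-in for a registrar interface; not Beam's actual API. */
  public interface UrnRegistrar {
    Map<String, String> urnToTranslator();
  }

  /** Hypothetical built-in registrar, referenced directly rather than via META-INF/services. */
  static final class BuiltInRegistrar implements UrnRegistrar {
    @Override
    public Map<String, String> urnToTranslator() {
      Map<String, String> translators = new LinkedHashMap<>();
      // The value is a placeholder name, used only for illustration.
      translators.put("beam:coder:length_prefix:v1", "length-prefix-translator");
      return translators;
    }
  }

  static Map<String, String> loadTranslators() {
    Map<String, String> all = new LinkedHashMap<>();
    // Built-ins are added by a direct constructor call, so they do not depend on
    // service files surviving jar repackaging.
    all.putAll(new BuiltInRegistrar().urnToTranslator());
    // Extensions are still discovered through ServiceLoader; if no service file is
    // present, this loop simply does not execute.
    for (UrnRegistrar registrar : ServiceLoader.load(UrnRegistrar.class)) {
      all.putAll(registrar.urnToTranslator());
    }
    return all;
  }

  public static void main(String[] args) {
    System.out.println(loadTranslators());
  }
}
```

The trade-off in this sketch is that built-ins are always available regardless of how the jar was assembled, while third-party registrars keep the plug-in behavior and can still override a built-in entry.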
> Presumably this used to work and doesn't now? Or do we need better instructions on creating an uberjar that correctly preserves all the registration information?
Yeah, I'm not sure what resulted in the regression. Might be the core-construction merge (but I haven't verified). +1 for updating instructions as a workaround while we figure out the root cause.
I created https://github.com/apache/beam/pull/31042 which should give a clearer error. Perhaps that is worth cherry-picking.
What happened?
Minimal reproduction:
A minimal pipeline:
Gradle build file:
Then build the uber jar with the command ./gradlew :beamtest:shadowJar and submit the job to Dataflow with
The job fails with the error
The same pipeline succeeded in Beam 2.54.0, 2.54.0 under Dataflow runner v2
Issue Priority
Priority: 1 (data loss / total loss of function)
Issue Components