apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

[Bug]: Dependency Issue with Apache Beam 2.51.0 Upgrade #29345

Open anilmor23 opened 11 months ago

anilmor23 commented 11 months ago

What happened?

After upgrading Apache Beam from version 2.44.0 to 2.51.0, we are encountering a dependency issue related to the Google Cloud BigQuery library. Specifically, we are facing the following error:

org.apache.beam.sdk.util.UserCodeException: java.lang.NoClassDefFoundError: Could not initialize class com.google.cloud.bigquery.storage.v1.stub.GrpcBigQueryWriteStub

We tried several versions of the google-cloud-bigquery dependency, such as 2.29.0, but none of them worked. The issue persists even after downgrading to Apache Beam 2.50.0. The only fix so far is reverting to Apache Beam 2.44.0, which works correctly, but 2.44.0 has a data loss issue on Dataflow.

This bug report aims to address the compatibility issue between Apache Beam 2.51.0 and the Google Cloud BigQuery library, seeking a solution or workaround for this problem to enable successful use of the latest Apache Beam version.
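
A note on reading the error above: the message "Could not initialize class" normally means the JVM found the class, but its static initializer had already failed once. The first failure (typically an ExceptionInInitializerError carrying the real cause) is therefore the one worth looking for in the logs. A minimal sketch of that JVM behavior follows. The Fragile class is a hypothetical stand-in, not Beam or BigQuery code.

```java
class InitFailureDemo {
    // Hypothetical stand-in for a class whose static initializer fails.
    static class Fragile {
        // Not a compile-time constant, so reading it triggers class initialization.
        static final int VALUE = compute();

        static int compute() {
            throw new IllegalStateException("simulated failure in static initializer");
        }
    }

    // Touches Fragile and reports the type of error the JVM raises.
    static String touch() {
        try {
            return String.valueOf(Fragile.VALUE);
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // First access runs the static initializer, which fails.
        System.out.println("first access:  " + touch());
        // Later accesses skip the initializer and report the class as unusable.
        System.out.println("second access: " + touch());
    }
}
```

The first access reports ExceptionInInitializerError and every later access reports NoClassDefFoundError, which is why the stack trace that contains the underlying cause is usually an earlier one.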

Issue Priority

Priority: 0 (outage / urgent vulnerability)

Issue Components

Abacn commented 11 months ago

Hi, are you using a custom container? Also, if you are using Maven or Gradle, you could use the standard Maven/Gradle tasks to check the dependency tree and see why an old version of the bigquery-storage client was picked up.

This is most likely a configuration issue, so setting the priority to P2.
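
As a complement to the dependency-tree check suggested above (for example `mvn dependency:tree`, or Gradle's `dependencies` and `dependencyInsight` tasks), it can help to ask the JVM which jar a class is actually loaded from at runtime. This is a rough sketch rather than anything Beam-specific. The ClassOrigin helper and its locate method are hypothetical names, and it assumes you can run it with the same classpath as the pipeline.

```java
import java.security.CodeSource;

class ClassOrigin {
    /** Returns the location the named class was loaded from, or a short reason if unknown. */
    static String locate(String className) {
        try {
            // initialize=false loads the class without running its static initializer,
            // so this also works for classes whose initialization fails.
            Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "(no code source; JDK/bootstrap class)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        } catch (LinkageError e) {
            // For example, a superclass or interface is missing or incompatible.
            return "(failed to link: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the error message of this issue.
        String name = args.length > 0
                ? args[0]
                : "com.google.cloud.bigquery.storage.v1.stub.GrpcBigQueryWriteStub";
        System.out.println(name + " -> " + locate(name));
    }
}
```

If the printed location points to an unexpected jar or version, that is the artifact to trace back in the dependency tree.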