protocolbuffers / protobuf

Protocol Buffers - Google's data interchange format
http://protobuf.dev

Binary incompatibility between BigQuery client and latest protobuf-java #18636

Open wendigo opened 5 days ago

wendigo commented 5 days ago

What version of protobuf and what language are you using?

Version: 4.28.2
Language: Java

What operating system (Linux, Windows, ...) and version?

MacOS/Linux

What runtime / compiler are you using (e.g., python version or gcc version)

Compiler: 3.x, runtime: 4.28.2

What did you do? Steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

What did you expect to see

BigQuery client working

What did you see instead?

Caused by: java.lang.NoSuchMethodError: 'boolean com.google.protobuf.DescriptorProtos$FieldOptions.hasExtension(com.google.protobuf.GeneratedMessage$GeneratedExtension)'
    at com.google.cloud.bigquery.storage.v1.BigQuerySchemaUtil.getFieldName(BigQuerySchemaUtil.java:77)
    at com.google.cloud.bigquery.storage.v1.JsonToProtoMessage.computeDescriptorAndSchema(JsonToProtoMessage.java:470)
    at com.google.cloud.bigquery.storage.v1.JsonToProtoMessage.lambda$convertToProtoMessage$0(JsonToProtoMessage.java:323)
    at java.base/java.util.HashMap.computeIfAbsent(HashMap.java:1229)
    at com.google.cloud.bigquery.storage.v1.JsonToProtoMessage.convertToProtoMessage(JsonToProtoMessage.java:320)
    at com.google.cloud.bigquery.storage.v1.JsonToProtoMessage.convertToProtoMessage(JsonToProtoMessage.java:271)
    at com.google.cloud.bigquery.storage.v1.JsonToProtoMessage.convertToProtoMessage(JsonToProtoMessage.java:180)
    at com.google.cloud.bigquery.storage.v1.SchemaAwareStreamWriter.buildMessage(SchemaAwareStreamWriter.java:166)
    at com.google.cloud.bigquery.storage.v1.SchemaAwareStreamWriter.appendWithUniqueId(SchemaAwareStreamWriter.java:239)
    at com.google.cloud.bigquery.storage.v1.SchemaAwareStreamWriter.append(SchemaAwareStreamWriter.java:140)
    at com.google.cloud.bigquery.storage.v1.JsonStreamWriter.append(JsonStreamWriter.java:65)
    at io.trino.plugin.bigquery.BigQueryPageSink.insertWithCommitted(BigQueryPageSink.java:110)
    at io.trino.plugin.bigquery.BigQueryPageSink.appendPage(BigQueryPageSink.java:102)
    at io.trino.operator.TableWriterOperator.addInput(TableWriterOperator.java:276)
    at io.trino.operator.Driver.processInternal(Driver.java:408)
    at io.trino.operator.Driver.lambda$process$8(Driver.java:306)
    at io.trino.operator.Driver.tryWithLock(Driver.java:709)
    at io.trino.operator.Driver.process(Driver.java:298)
    at io.trino.operator.Driver.processForDuration(Driver.java:269)
    at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:890)
    at io.trino.execution.executor.dedicated.SplitProcessor.run(SplitProcessor.java:77)
    at io.trino.execution.executor.dedicated.TaskEntry$VersionEmbedderBridge.lambda$run$0(TaskEntry.java:201)
    at io.trino.$gen.Trino_testversion____20241007_092855_2.run(Unknown Source)
    at io.trino.execution.executor.dedicated.TaskEntry$VersionEmbedderBridge.run(TaskEntry.java:202)
    at io.trino.execution.executor.scheduler.FairScheduler.runTask(FairScheduler.java:172)
    at io.trino.execution.executor.scheduler.FairScheduler.lambda$submit$0(FairScheduler.java:159)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:76)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1575)

Anything else we should know about your project / environment

zhangskz commented 5 days ago

Looks like a compatibility issue specific to DescriptorProtos (and probably the other WKTs), which are bundled with the Java runtime. BigQuerySchemaUtil (a dependency of Trino) has code written against the v3.25.5 gencode for DescriptorProtos.FieldOptions, but at run time it actually gets the v4.28.2 gencode from the Java runtime.

DescriptorProtos.FieldOptions gencode extends GeneratedMessage.ExtendableMessage in v4.28.2, which doesn't have shims for the deprecated hasExtension() method that were added to GeneratedMessageV3.ExtendableMessage for binary compatibility with older v3.25.5 gencode.
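To make the role of such a shim concrete, here is a minimal, stdlib-only sketch. Every name in it (`ShimSketch`, `ExtLite`, `GenExt`, `NewBase`, `ShimmedBase`) is a hypothetical stand-in, not a real protobuf class, and it assumes the older, more specific extension type is a subtype of the general one. It shows only that a base class offering just the general overload no longer exposes the exact parameter list that previously compiled callers reference, while a base class with a deprecated delegating overload still does.

```java
// Hypothetical stand-ins only; none of these are real protobuf classes.
final class ShimSketch {
    private ShimSketch() {}

    // Plays the role of the general extension type.
    static class ExtLite {}

    // Plays the role of the older, more specific extension type.
    static class GenExt extends ExtLite {}

    // Base class WITHOUT a shim: only the general overload exists.
    static class NewBase {
        public boolean hasExtension(ExtLite ext) {
            return true;
        }
    }

    // Base class WITH a shim: the old parameter list is kept and delegates.
    static class ShimmedBase {
        public boolean hasExtension(ExtLite ext) {
            return true;
        }

        @Deprecated
        public boolean hasExtension(GenExt ext) {
            return hasExtension((ExtLite) ext);
        }
    }

    // Class.getMethod matches parameter types exactly, which is close to how
    // an already-compiled call site is linked (the JVM also checks the return type).
    static boolean declaresExactOverload(Class<?> owner) {
        try {
            owner.getMethod("hasExtension", GenExt.class);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("NewBase keeps old overload: " + declaresExactOverload(NewBase.class));
        System.out.println("ShimmedBase keeps old overload: " + declaresExactOverload(ShimmedBase.class));
    }
}
```

Source code calling `hasExtension(new GenExt())` would still compile against `NewBase`, because overload resolution picks the general method. Only callers that were already compiled against the old overload fail at link time, which is consistent with the `NoSuchMethodError` in the report.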

This isn't quite an old gencode + new runtime binary compatibility issue. Rather, it is user code compiled against old gencode that ends up calling new gencode. Currently, I think BigQuerySchemaUtil would need to be updated to remove the deprecated method call in order to upgrade. We may need to consider other ways to avoid this for WKTs shipped with the Java runtime.
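For anyone who wants to check which overloads their classpath actually provides before upgrading, a small reflection probe may help. This is a sketch under assumptions, not an official diagnostic: `MethodProbe` and `hasPublicMethod` are hypothetical names, the class and parameter names passed in `main` are copied from the stack trace above, and the probe matches on method name and parameter types only (the JVM additionally matches the return type when linking).

```java
// Hypothetical helper; uses only java.lang reflection, so it runs with or
// without protobuf on the classpath.
final class MethodProbe {
    private MethodProbe() {}

    // Returns true if className is loadable and has a public method with
    // exactly these parameter types. Parameter types are given as binary
    // class names (primitives such as "int" are not supported here).
    static boolean hasPublicMethod(String className, String methodName, String... paramTypeNames) {
        ClassLoader loader = MethodProbe.class.getClassLoader();
        try {
            // 'false' avoids running static initializers of the probed classes.
            Class<?> owner = Class.forName(className, false, loader);
            Class<?>[] params = new Class<?>[paramTypeNames.length];
            for (int i = 0; i < params.length; i++) {
                params[i] = Class.forName(paramTypeNames[i], false, loader);
            }
            owner.getMethod(methodName, params);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            // Missing class, missing method, or a class that fails to link.
            return false;
        }
    }

    public static void main(String[] args) {
        // Names taken from the NoSuchMethodError in this issue.
        boolean present = hasPublicMethod(
                "com.google.protobuf.DescriptorProtos$FieldOptions",
                "hasExtension",
                "com.google.protobuf.GeneratedMessage$GeneratedExtension");
        System.out.println("old hasExtension overload present: " + present);
    }
}
```

Run it with the same classpath as the failing application. A `false` result means either the classes are not on the classpath or the overload named in the error is not available there.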