protocolbuffers / protobuf

Protocol Buffers - Google's data interchange format
http://protobuf.dev

Offer a better migration path from protobuf-java 3.x to 4.x #16452

Closed ummels closed 1 month ago

ummels commented 5 months ago

What language does this apply to?

Java

Describe the problem you are trying to solve.

After updating protobuf-java to 4.26.0 in our project, we noticed a serious runtime error in production: code from an external dependency we're using throws a NoClassDefFoundError when calling the newBuilder() method of a generated class:

java.lang.NoClassDefFoundError: com/google/protobuf/GeneratedMessageV3$ExtendableMessageOrBuilder
    at java.base/java.lang.ClassLoader.defineClass1(Native Method)
    at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1017)
    at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150)
    at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:862)
    at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:760)
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:681)
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:639)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
    at java.base/java.lang.ClassLoader.defineClass1(Native Method)
    at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1017)
    at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150)
    at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:862)
    at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:760)
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:681)
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:639)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
    at no.ecc.vectortile.VectorTileEncoder.encode(VectorTileEncoder.java:359)

This is a serious problem since there are many Java libraries floating around, which internally use Protobuf.
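One way to surface this kind of failure before it reaches production is a startup probe that checks whether the class named in the stack trace can be resolved. The following is only a minimal sketch using the standard `Class.forName` API; the class name `ProtobufRuntimeProbe` and its method are hypothetical and not part of protobuf-java.

```java
// Hypothetical startup probe; the class and method names are illustrative only.
final class ProtobufRuntimeProbe {
    private ProtobufRuntimeProbe() {}

    /** Returns true if the named class can be located on the classpath (without initializing it). */
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ProtobufRuntimeProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised while resolving superclasses.
            return false;
        }
    }

    public static void main(String[] args) {
        // Base class that gencode produced by a 3.x protoc extends.
        boolean hasV3Base = isPresent("com.google.protobuf.GeneratedMessageV3");
        System.out.println("GeneratedMessageV3 on classpath: " + hasV3Base);
    }
}
```

A check like this only reports whether the old base class is present; it cannot tell whether a given third-party library actually depends on it.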

Describe the solution you'd like

In my opinion, the easiest way to fix the incompatibility would be to reintroduce GeneratedMessageV3 as a (deprecated) alias for GeneratedMessage, but maybe there are better alternatives?
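Java has no type aliases, so an "alias" here would presumably be a deprecated subclass that keeps the old name resolvable. The sketch below illustrates the idea with stand-in classes only; none of these types are the real protobuf classes, and a real shim would likely also need to cover nested types such as the `ExtendableMessageOrBuilder` named in the stack trace.

```java
// Stand-in for the 4.x base class (the real one is com.google.protobuf.GeneratedMessage).
abstract class GeneratedMessageStandIn {
    abstract String describe();
}

// Hypothetical shim: keeps the old class name resolvable and inherits everything from the new base.
@Deprecated
abstract class GeneratedMessageV3StandIn extends GeneratedMessageStandIn {}

// Stand-in for a class generated by a 3.x protoc, which extends the old base class name.
@SuppressWarnings("deprecation")
final class LegacyGeneratedTile extends GeneratedMessageV3StandIn {
    @Override
    String describe() {
        return "tile";
    }
}
```

With the shim in place, `new LegacyGeneratedTile()` links against the old name while still being usable wherever the new base type is expected.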

ummels commented 5 months ago

See also https://github.com/grpc/grpc-java/issues/11015.

googleberg commented 5 months ago

@ummels thank you for the feedback. We are evaluating the option to reintroduce GeneratedMessageV3 as a deprecated shim to smooth the transition.

Just want to add that in line with our version policy we will be supporting the protobuf 3.x line for a year to allow projects to update/adapt.

googleberg commented 5 months ago

The protobuf team has been looking at better migration paths for this first major version bump since the introduction of proto3 and future major version bumps. We'll publish the plan soon at https://protobuf.dev/support/version-support/

The exact date and version numbers are still TBD, but this quarter (between now and 7/1/24) we will make a set of releases that allow gencode from a new patch release of the 3.25.x line to work with runtimes from the 4.x.x release (and all v4 releases following).

Because we're getting to this a bit slowly, we will also be extending support for 3.25.x to Q1 of 2026 to allow plenty of time for users to migrate.

Thank you for your patience.

eed3si9n commented 4 months ago

"Java will target making major version bumps annually in Q1 of each year."

IMO 1 year is too short, and it will reduce the usefulness of Protocol Buffers.

protobuf-java 3.x maintained its binary compatibility for 8 years, from July 2016 to March 2024, and this stability has allowed Protocol Buffers to evolve from a schema for internal data serialization into part of the fabric of the JVM ecosystem, where proto-generated Java files are published as libraries. This includes Apache Hadoop, Apache Kafka, gRPC, and by extension Google Cloud APIs, etc. If I understand Version Support correctly, it's saying protobuf-java will break binary compatibility every year in Q1, splitting the JVM ecosystem each year. Would that mean that all downstream libraries would also make breaking changes every year in Q1? Or would it look similar to Java or Ubuntu versions, where people try to stick with some LTS version? Or would downstream projects pick different versions at different times?

I think, practically, the options for protobuf-java users are to stay on 3.x as long as possible, to shade protobuf-java 4.x/5.x/6.x at the point of publishing a library, and/or to shade downstream dependencies of protobuf-java 4.x/5.x/6.x such as gRPC-based libraries (should gRPC bump along).

googleberg commented 4 months ago

While it may appear that protobuf-java 3.x maintained its binary compatibility for 8 years, from July 2016 to March 2024, we don't actually guarantee that. Cross-release compatibility was not really tested in the past and there was a tendency to "sneak in" breaking changes "some time" after stopping use of an internal API.

For the past 2 years we have been gradually making our release and support policy more concrete and predictable. The effect of the updated rolling compatibility policy is that gencode from v4.x will be compatible with v4.x+ and v5.* runtimes; v5.y gencode will be compatible with v5.y+ and v6.* runtimes; etc. That means that gencode should be regenerated every 2 years or so. In order to get security and bug fixes, gencode really does have to be updated in addition to updating the runtime.

For v3 gencode on v4 runtime compatibility, we are trying to establish a minimum 3.x gencode version that will work with the v4.* runtimes, and we're fairly optimistic that it can be: any gencode from version >= 3.22.0 will work with v4.* runtimes. That would provide the same ~2 year compatibility window that we'd aim for in the future.
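The rolling compatibility rule described above can be written down as a small predicate. This is only a sketch of the rule as stated in this comment, not an official check: the class and method names are hypothetical, the 3.22 floor is the tentative value mentioned above, and applying the same-major rule to 3.x is an assumption (cross-release compatibility within 3.x was not formally guaranteed).

```java
// Hypothetical model of the rolling compatibility policy; not part of protobuf-java.
final class RollingCompatSketch {
    private RollingCompatSketch() {}

    // Tentative minimum 3.x gencode minor version expected to work with v4 runtimes.
    static final int MIN_V3_MINOR_FOR_V4 = 22;

    /** Returns true if gencode (genMajor.genMinor) is expected to work on runtime (rtMajor.rtMinor). */
    static boolean isSupported(int genMajor, int genMinor, int rtMajor, int rtMinor) {
        if (rtMajor == genMajor) {
            // Same major: the runtime must be at least as new as the gencode.
            return rtMinor >= genMinor;
        }
        if (rtMajor == genMajor + 1) {
            // Next major: any runtime minor is covered, except that 3.x gencode needs the floor.
            return genMajor != 3 || genMinor >= MIN_V3_MINOR_FOR_V4;
        }
        // Older runtimes, or runtimes two or more majors ahead, are outside the window.
        return false;
    }
}
```

For example, under these assumptions `isSupported(4, 26, 5, 0)` holds while `isSupported(4, 26, 6, 0)` does not, which is why gencode would need regenerating roughly every two years.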

Why make breaking changes at all?

Most of the "breaking" changes we want to make are just cleanups in the interfaces between gencode and runtime. It is unlikely (though not impossible) that we would make breaking changes in the generated APIs or the core protobuf interfaces (MessageOrBuilder, Message, Message.Builder, MessageLiteOrBuilder, MessageLite, MessageLite.Builder, etc). Even within Google's monorepo it is very painful to make breaking changes to these interfaces or the generated code APIs.

Most cleanups are to remove legacy codepaths as we evolve the implementation for better performance and reduced maintenance.

How should OSS code use protobuf?

In many ways, we're still figuring this out.

Downstream projects that use generated bindings in their public APIs should update to the most recent protobuf runtime that supports their gencode. This should not be considered a breaking change. Over the next year, libraries should regenerate their gencode with more recent versions. If none of the changes within protobuf are breaking API changes, it seems like this is not a breaking change to the library itself and it is not required to do a major version bump.

While the standard protobuf binary wire format is very stable and a great choice for cross binary APIs, the protobuf gencode is not a good choice for public APIs because of the tight coupling to the runtime. In fact, many of the updates that are considered "safe" from a proto schema evolution perspective (like removing unused fields) are breaking changes in a Java API.

Shading is a valid way to work around version dependency problems in cases where you want to use gencode in public APIs.

zhangskz commented 1 month ago

New shims were added in https://github.com/protocolbuffers/protobuf/commit/6bf01c51a0b92278958f0169d330d64a08dbb4ec that should restore binary compatibility of 3.x gencode with the 4.x runtime, per the updated "rolling compatibility policy" in the previous comment. This has been released in v4.28.0-rc3 and should be backported to v27.4 shortly as well.

See also https://github.com/protocolbuffers/protobuf/issues/17247 for updates on these issues.