You should regenerate your code using the 2.5.0 compiler when using the 2.5.0
library.
Original comment by Oliver.J...@gmail.com
on 5 Apr 2013 at 2:51
Original comment by xiaof...@google.com
on 5 Apr 2013 at 5:18
This is *really* unfortunate, given that 2.5-generated code won't compile
against 2.4 libraries. It means the entire Hadoop ecosystem has to move
protobuf versions in lockstep. Can I suggest that if you are going to break
compatibility, you make it fail at compile time rather than at run time, so
the error is caught during compilation?
Much better, of course, is not to break compatibility at all. Using reflection
to determine the version of the protobuf library, and using new features only
if they are available, would be the preferred solution.
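Such a reflection-based probe might look like the following. The class and method names are real protobuf identifiers, but whether this particular probe cleanly separates 2.4.x from 2.5.x is an assumption based on the TextFormat API change discussed later in this thread:

```java
import java.lang.reflect.Method;

public class ProtobufFeatureProbe {
    // Probe for a 2.5-only signature instead of hard-linking against it.
    static boolean has25TextFormat() {
        try {
            Class<?> textFormat =
                Class.forName("com.google.protobuf.TextFormat");
            Class<?> mob =
                Class.forName("com.google.protobuf.MessageOrBuilder");
            Method print = textFormat.getMethod("print", mob, Appendable.class);
            return print != null;
        } catch (ReflectiveOperationException e) {
            // Class or method absent: fall back to the 2.4.x code path.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(has25TextFormat()
            ? "2.5-style TextFormat available"
            : "falling back to 2.4.x behavior");
    }
}
```

With no protobuf jar on the classpath at all, the probe simply reports the fallback path, which is the desired failure mode.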
Original comment by owen.oma...@gmail.com
on 20 Aug 2013 at 10:04
I ran into this problem because I had protoc 2.4.1 installed on my system, and
the Maven build I was invoking used it to generate code against the 2.5.0
library. protoc returned without error and the generated code compiled, but it
caused this runtime error because of the incompatibility. These kinds of
silent failures can be a real pain to track down if you aren't a developer
explicitly working on or integrating protobuf in your project.
Original comment by deinspanjer
on 10 Jan 2014 at 5:31
protobuf is embedded in many Hadoop-ecosystem libraries (Pig, Hadoop, Hive, ...);
breaking compatibility is definitely a bad idea :(
Original comment by hisanth...@gmail.com
on 18 Mar 2014 at 5:26
To keep compatibility, I made a small hack in getUnknownFields and deployed a
new version of protobuf-java. It works fine after some tests.
I don't know if there are potential problems? thx
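For reference, the shape of the hack being described, as a self-contained sketch with stand-in types rather than the actual patch: in 2.5.0, GeneratedMessage.getUnknownFields() became abstract, so classes generated by protoc 2.4.x, which never override it, fail at runtime; giving the method a default body restores the old behavior.

```java
// Stand-in for the 2.5.0 GeneratedMessage base class with the hack
// applied: the method has a default body instead of being abstract.
abstract class GeneratedMessageCompat {
    // Was (2.5.0): public abstract UnknownFieldSet getUnknownFields();
    public String getUnknownFields() {
        // 2.4.x-generated classes never override this, so a default
        // (empty) result keeps them working. String stands in for
        // UnknownFieldSet in this sketch.
        return "empty-unknown-field-set";
    }
}

// Stand-in for a message class emitted by protoc 2.4.x, which does
// NOT override getUnknownFields().
class OldGeneratedMessage extends GeneratedMessageCompat {}

public class CompatDemo {
    public static void main(String[] args) {
        GeneratedMessageCompat m = new OldGeneratedMessage();
        System.out.println(m.getUnknownFields());
    }
}
```

The trade-off implied by the patch: a 2.4.x-generated message silently reports no unknown fields rather than failing, which is the open question about "potential problems" above.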
Original comment by lshmouse
on 21 Mar 2014 at 3:02
Attachments:
Can a project member please comment on the patch posted in comment #6? It is a
must-have to understand the limitations of running a patched version of
protobuf. A fix this simple to achieve compatibility would presumably be
committed unless there are caveats.
Original comment by jeag...@gmail.com
on 10 Jun 2014 at 3:32
We don't want to provide this runtime compatibility across different versions.
If you are using protoc 2.5, the corresponding 2.5 runtime library must be
used. In the future we may update the generated code to report a
version-mismatch error (as we already do for C++) so users will know what's
going wrong.
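The C++-style check mentioned here could, in principle, look like this in Java. This is a self-contained sketch with illustrative names, not protobuf's actual mechanism:

```java
// The runtime library advertises its version...
final class RuntimeVersion {
    static final int MAJOR = 2, MINOR = 5;

    static void verify(int expectedMajor, int expectedMinor) {
        if (MAJOR != expectedMajor || MINOR != expectedMinor) {
            throw new ExceptionInInitializerError(
                "generated code expects protobuf " + expectedMajor + "."
                + expectedMinor + ", but runtime is " + MAJOR + "." + MINOR);
        }
    }
}

// ...and each generated class verifies it once, at class-load time,
// so a mismatch fails fast with a clear message instead of a cryptic
// error deep inside the library.
class GeneratedStub {
    static {
        RuntimeVersion.verify(2, 5);
    }
}

public class VersionGuardDemo {
    public static void main(String[] args) {
        new GeneratedStub();  // triggers the static initializer check
        System.out.println("version ok");
    }
}
```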
Original comment by xiaof...@google.com
on 10 Jun 2014 at 6:01
Help us understand why you consider cross-version compatibility between the
generated code and the library problematic.
It has certainly caused many of us pain and has damaged protobuf's reputation
among developers. To me, it seems like another form of API incompatibility.
Original comment by owen.oma...@gmail.com
on 10 Jun 2014 at 9:42
Protoc and the runtime together form the protobuf library. Taking protoc at
one version and trying to use it with the runtime at another is like taking
half of a library from one version and the other half from a different version
and expecting them to work together. AFAIK, protobuf has never tried to
provide such compatibility. Generated code and the runtime are strictly tied
together in both design and implementation. The connection between generated
code and the runtime library is a protobuf internal implementation detail; it
is not part of the protobuf API and we don't guarantee it to be stable. Users
who experience this problem should update their project to use the same
version for both halves of the protobuf library instead.
Original comment by xiaof...@google.com
on 10 Jun 2014 at 10:29
For your own project, regenerating the code is not unreasonable.
The problem comes when you use Ant/Ivy or Maven to pull in a jar you depend
on, and that jar uses protobuf. It isn't clear that it uses protobuf, and it
certainly isn't clear which version of protoc was used to build it. If it
happened to use a different protoc version, you get a runtime failure with a
cryptic error message. For example, if I add the dependency:
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-mapreduce</artifactId>
  <version>2.2.0</version>
</dependency>
I need to investigate whether it uses protobuf and which version. Otherwise, it
will blow up at runtime. Furthermore, I can only use versions of the many
different open source projects that use exactly the same version of protobuf.
Original comment by owen.oma...@gmail.com
on 11 Jun 2014 at 3:40
In this case, why doesn't hadoop-mapreduce have a dependency on the library
version it requires for correct operation?
Original comment by Oliver.J...@gmail.com
on 11 Jun 2014 at 6:54
The "can only use one version" issue is not a problem specific to protobuf;
it's a general problem with common libraries. In cases where the common library
objects are externally used, you're pretty much stuck with picking a version
and making sure everything is OK with that. For cases where the dependencies
are purely implementation dependencies, you can either isolate the libraries in
separate classloaders, or use something like "jarjar" so the dependencies don't
collide.
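For the jarjar-style approach, a Maven Shade relocation is one common way to isolate an implementation-only protobuf dependency. A sketch; the shaded package name is arbitrary:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Rewrites protobuf classes (and bytecode references to
                 them) into a private package so they cannot collide
                 with another protobuf version on the classpath. -->
            <pattern>com.google.protobuf</pattern>
            <shadedPattern>myproject.shaded.com.google.protobuf</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

As noted above, this only works when protobuf is a pure implementation detail; if protobuf types appear in a library's public API, relocation changes that API.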
Original comment by Oliver.J...@gmail.com
on 11 Jun 2014 at 7:04
This is more than "can only use one version." Protobuf requires that you run
with precisely the version of the library that all of your components were
compiled with. So if you want to use protobuf 2.5 for your project, but you
need to use project OpenFoo, which was compiled with protobuf 2.4, you have to
get the sources and recompile instead of simply using the binaries from Maven
Central. Now repeat for each of the dozens of projects you depend on.
Additionally, most of the linux distros just give you a single version of
protoc. That means that unless you are using the default for your distro, you
need to fetch protoc and compile it too.
None of this is the end of the world, of course. Once you know about the
problem, you look for it and make sure you use the right version of protoc.
But it amounts to a version-incompatible change in every protobuf release,
which is sub-optimal for users who don't recompile the world by default.
Original comment by owen.oma...@gmail.com
on 11 Jun 2014 at 9:11
protobuf requires that if you have some generated class G that was generated
using protoc version V, then G must resolve the protobuf support classes at
runtime to protobuf library version V.
So if you are distributing the generated code G, then have it declare a
dependency on protobuf library version V. I thought this was the case you were
describing above with hadoop-mapreduce; is this not what you meant?
If you are *building* G, then you need to make sure that the declared
dependencies of G match the version of protoc you actually used. This isn't too
hard to do with Ivy/Ant: I use an ivy module that provides an Ant task that
builds with a particular protoc version, and a configuration with a transitive
dependency on the matching runtime library version. I don't know about doing
the equivalent with Maven.
In your example with OpenFoo, either (a) OpenFoo happens to use protobuf as an
implementation detail, and you can use one of the approaches I suggested in my
previous comment; or (b) OpenFoo exposes protobuf classes as part of the API
you want to deal with in your main project, and you must make sure that all
parts of that interacting system have the same version of protobuf - this _is_
the "can only use one version" case again. You have two options here: (1) use
protobuf 2.4 for your project; (2) create or build a new version of OpenFoo
that supports 2.5.
As far as I can see this really looks no different to any other case where you
have two pieces of interacting code that want to use the same library, but
there are incompatibilities in the two library versions that they want to use.
Original comment by Oliver.J...@gmail.com
on 11 Jun 2014 at 9:42
I agree with @owen's comments. Yes, it is ideal that all users of hadoop
upgrade to the protobuf version that the hadoop framework uses. But that is
easy only if you have a closed dependency system. In reality that is neither
easy nor practical.
With backward compatibility broken between 2.4.1 and 2.5.0, we are essentially
forcing ALL users of Hadoop to upgrade to 2.5.0. Dependencies are not defined
in isolation; sometimes they come in transitively (e.g., a library you happen
to depend on ships protobuf classes generated with 2.4.1). So not only do you
need to upgrade your own protobuf version, you also need to hunt down all the
transitive cases and find a new version of each such library that uses 2.5.0.
It may not even exist.
This would be so much easier on users if protobuf 2.5.0 were backward
compatible with 2.4.1, as the version numbers might semantically suggest.
Original comment by sjl...@gmail.com
on 17 Jun 2014 at 5:43
@xiaof...
> Users who experience this problem should update their project to use the same
version for both halves of the protobuf library instead.
This presumes that developers have full control over the protobuf libraries
used, which is not always the case. If I use third-party library A (compiled
with, and distributed with, protobuf 2.4.1) and third-party library B
(compiled with, and distributed with, protobuf 2.5), then whichever version I
choose, one library or the other will not work[*]. I may not have the option
of recompiling the third-party libraries against different versions of
protobuf. This could create a divisive dichotomy in the ecosystem of
third-party libs: those on pb 2.4.1 and below, and those on pb 2.5 and above,
and never shall one half be used in the same application as the other.
A suggestion might be to fix up the libraries so that both versions can live
side by side - for example, by tweaking the package name to include the
version number (on breaking changes only)?
[*] I'm assuming the problem goes both ways, I haven't actually tested this
yet...
Original comment by baca...@gmail.com
on 25 Jul 2014 at 8:41
Hi bacaruk,
It seems the problem you mention applies to any library. For example, suppose
you open-sourced some_library version 1.0 and later upgraded it to version
2.0. How would you combine a third-party library A using version 1.0 with
another third-party library B using version 2.0 in the same project?
I agree that making 2.4.1-generated code compatible with the 2.5.0 runtime (or
vice versa) would help in some ways, but it would require a significant
change/redesign of the protobuf library to guarantee such compatibility. In
the future, when we need to make significant changes to protobuf (or implement
it for new languages), we can consider this, but for now it's not worthwhile
to re-architect the protobuf implementation solely for this purpose.
Original comment by xiaof...@google.com
on 25 Jul 2014 at 5:29
Making the 2.4 generated code continue to work with the 2.5 library should not
be a large burden. There is a patch above, submitted 4 months ago, that fixes
the problem.
Original comment by owen.oma...@gmail.com
on 25 Jul 2014 at 6:11
True, it's definitely equally applicable for other libraries. It's a shame when
other libraries do that too :-)
Does the protobuf team publish a specific versioning/compatibility policy?
It's important for library developers to understand that upgrading their
dependency to a particular version may cause problems.
In this instance, is this the only breaking change, and if so, is there a
specific reason it has to be there? It's not clear to me whether the submitted
patch does the appropriate thing.
Thanks
Baris
Original comment by baca...@gmail.com
on 25 Jul 2014 at 8:22
Sorry to resurrect this, but we've run into this exact issue when trying to
move to the CDH5 distribution of Hadoop, which uses protobuf 2.5.0, while our
code base is still on protobuf 2.4.1.
I'd like to find out whether the submitted patch is complete or if there are
other places where patches need to be applied...
Original comment by verun.ka...@gmail.com
on 20 Feb 2015 at 4:29
We see one other compatibility issue between 2.4.1 and 2.5: the
TextFormat.print***() methods changed from taking Message as an argument to
taking MessageOrBuilder. This causes a NoSuchMethodError when the versions are
mismatched.
We had to patch it by overloading, adding the old methods back on top of 2.5.
FWIW, this is one of the most frequent (and most painful) user code issues we
encounter with Hadoop 2.
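The shape of that overload workaround, as a self-contained analogue (the types here are stand-ins, not the real protobuf classes): re-adding the old Message signature alongside the 2.5 MessageOrBuilder one means binaries compiled against either signature can resolve a matching method at link time.

```java
import java.io.IOException;

// Stand-ins for the real protobuf types.
interface MessageOrBuilder {}
interface Message extends MessageOrBuilder {}

final class TextFormatCompat {
    // 2.5-style entry point, taking the new supertype.
    static void print(MessageOrBuilder m, Appendable out) throws IOException {
        out.append("printed");
    }

    // Re-added 2.4.1-style signature: code compiled against the old
    // Message overload links against this method at runtime instead
    // of failing with NoSuchMethodError. The upcast avoids recursing
    // back into this same overload.
    static void print(Message m, Appendable out) throws IOException {
        print((MessageOrBuilder) m, out);
    }
}

public class OverloadDemo {
    public static void main(String[] args) throws IOException {
        StringBuilder sb = new StringBuilder();
        // Resolves to the Message overload, which delegates.
        TextFormatCompat.print(new Message() {}, sb);
        System.out.println(sb);
    }
}
```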
Original comment by sjl...@gmail.com
on 20 Feb 2015 at 5:18
Original issue reported on code.google.com by
this.goe...@googlemail.com
on 5 Apr 2013 at 1:31