Open atniomn opened 2 years ago
This issue is also described in https://github.com/conan-io/conan/issues/11193, but I don't know if it's fully fixed in conan v2.
Hi @atniomn, thank you for raising this issue! Protobuf is always an interesting one :)
I believe there are multiple aspects of compatibility that you need to take into account when using Protobuf in Conan packages:
.pb.h and the headers from the protobuf library
This is the case you are reporting. If you have a Conan package that exposes .pb.h files as public headers, the downstream consumers of this package need to be using the same version of the protobuf library (the .pb.h headers include headers from it). A recent version of protobuf performs a check like this:
#if PROTOBUF_VERSION < 3018000
#error This file was generated by a newer version of protoc which is
#error incompatible with your Protocol Buffer headers. Please update
#error your headers.
#endif
#if 3018001 < PROTOBUF_MIN_PROTOC_VERSION
#error This file was generated by an older version of protoc which is
#error incompatible with your Protocol Buffer headers. Please
#error regenerate this file with a newer version of protoc.
#endif
By the looks of it, the check is so narrow that in practice you will need the exact same version. As you have mentioned, you would have to override the default semver_direct_mode and use something stricter, like full_version_mode. You can do this only for the Protobuf dependency rather than globally; you can see examples here.
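To make the arithmetic of the guard above concrete, here is a minimal Python sketch of the same two comparisons. The function and parameter names are made up for illustration, and the value used for the installed headers' minimum protoc version is an assumption, not something taken from a real protobuf release.

```python
def encode_version(major: int, minor: int, patch: int) -> int:
    """Pack a version the way the guard does: 3.18.1 -> 3018001."""
    return major * 1_000_000 + minor * 1_000 + patch


def check_generated_header(
    min_headers_required: int,  # constant baked into the .pb.h (3018000 above)
    generated_by_protoc: int,   # constant baked into the .pb.h (3018001 above)
    installed_headers: int,     # PROTOBUF_VERSION of the consumer's headers
    installed_min_protoc: int,  # PROTOBUF_MIN_PROTOC_VERSION of those headers
) -> str:
    """Mirror the two #if checks and say which #error, if any, would fire."""
    if installed_headers < min_headers_required:
        return "error: generated by a newer protoc"
    if generated_by_protoc < installed_min_protoc:
        return "error: generated by an older protoc"
    return "ok"


# A header generated by protoc 3.18.1, consumed with 3.17.1 headers.
# The last argument (3.17.0) is an assumed value for illustration only.
result = check_generated_header(
    encode_version(3, 18, 0),
    encode_version(3, 18, 1),
    installed_headers=encode_version(3, 17, 1),
    installed_min_protoc=encode_version(3, 17, 0),
)
print(result)
```

With these inputs the first comparison fails, which corresponds to the "generated by a newer version of protoc" error in the guard.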
In a scenario where you have multiple libraries that have compiled .pb.cc files inside them, and these libraries live in different Conan packages, if you have an executable that needs to load these libraries at runtime, the protobuf library loaded by the linker (or statically linked into the executable) needs to be compatible and provide the symbols expected by all those .pb.cc files. If those .pb.cc files were created by different versions of protoc against different versions of the protobuf headers, but the executable only loads one version, it may not work.
Protobuf themselves recommend recompiling all .pb.cc files against the same version of the library to avoid these issues, as they do not guarantee ABI compatibility. See here.
Note that I'm listing this as a separate case from the one above involving the headers, because it is possible to have multiple shared libraries that have .pb.cc files in them without exposing any .pb.h headers publicly. This would effectively isolate the protobuf layer to where it's needed, and expose an abstraction on top of it. I believe this is the approach followed by the Google API Client libraries, although they no longer maintain C++ libraries.
.proto files
As you suggest, it is also possible to bundle .proto files so that consumers "build" them themselves - this would ensure at least that, for those consumers, compatibility between .pb.h and the library headers, and between the generated code and the runtime library, is guaranteed.
However, because .proto files are compiled into C++, we need to take special care: depending on how they are used and how they evolve, we can cause source incompatibilities, runtime errors, or protobuf wire protocol issues. For example:
C++ source compatibility: If package Foo provides Foo.proto, which is then imported by package Bar, and it is Bar that generates the .pb.h and .pb.cc files, any changes to field names (among other things) will result in different C++ being generated. So it is possible that a newer version of package Foo is released, and when Bar consumes it, the code in Bar that interacts with the classes defined in .pb.h may need to be altered as well. This is a C++ source compatibility issue.
Binary compatibility: If a package provides Foo.proto, and multiple other packages compile the generated .pb.cc files, and there is any chance that the same executable downstream may load these independent libraries as dependencies, we would be dealing with symbols that are defined multiple times, which would result in ODR violations. If there is any chance that multiple libraries contain symbols for the same Foo.proto file, but each compiled a different variation of it (e.g. due to versioning), that also increases the risk of issues.
Wire compatibility: While Protobuf is designed to allow extending the .proto message definitions without breaking the wire protocol (that is, the ability to correctly decode the binary representation of a message whose creator may be based on an older or newer version of the .proto definition), it is also very easy to break wire compatibility. If you are using Conan and choose to bundle .proto files for consumers to use, and both producers and consumers of messages are built from these .proto files coming from Conan packages, you'll need to carefully craft a versioning strategy for the Conan packages such that breaking changes are captured and detected. If you have two consumers exchanging protobuf data, and the message definition differs between them due to a breaking change, this may cause issues that are very hard to detect even at runtime (for instance, a message can still decode fine but with a changed meaning). The buf tool (see docs here) is good at detecting breaking changes.
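The "decodes fine, but with a changed meaning" case can be shown with a small Python sketch. It hand-rolls the varint part of the protobuf wire format instead of using the protobuf library, and the field names and schemas are hypothetical.

```python
from typing import Dict, Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos
        shift += 7


def encode_message(fields: Dict[int, int]) -> bytes:
    """Encode {field_number: value} using wire type 0 (varint) only."""
    out = bytearray()
    for number, value in fields.items():
        out += encode_varint((number << 3) | 0)
        out += encode_varint(value)
    return bytes(out)


def decode_message(data: bytes, schema: Dict[int, str]) -> Dict[str, int]:
    """Decode varint-only messages, naming fields according to schema."""
    decoded = {}
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        value, pos = decode_varint(data, pos)
        name = schema.get(key >> 3)
        if name is not None:  # unknown field numbers are skipped
            decoded[name] = value
    return decoded


# Hypothetical schemas: the same field number, redefined with a new meaning.
SCHEMA_V1 = {1: "timeout_seconds"}
SCHEMA_V2 = {1: "timeout_millis"}

payload = encode_message({1: 30})             # producer built against v1
received = decode_message(payload, SCHEMA_V2)  # consumer built against v2
print(received)
```

The consumer decodes the payload without any error, yet reads 30 seconds as 30 milliseconds. Nothing on the wire signals the mismatch, so it has to be caught by the versioning strategy or by a breaking-change checker.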
As a summary:
If you are building a project using Conan dependencies, and anything in the dependency graph uses protobuf, you want to make sure that the entire graph uses the same version of protobuf, or cause Conan to report missing binary packages if the version of protobuf changes for any of them. This is indeed achieved by full_version_mode.
If you have .pb.cc for the same .proto file compiled inside multiple libraries, you want to make sure that an executable only ever loads one copy - otherwise you may run into ODR violations or symbol issues.
If you have the same .proto in multiple packages/libraries, and can guarantee the point above (only one definition in an executable), and you have two separate executables that need to exchange protobuf binary data for the message defined in a given .proto, you want to ensure that the .proto message definition follows Google guidelines and does not break wire compatibility.
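The second point in the summary can be checked mechanically if you know which .proto files each library compiles in. Below is a minimal sketch, assuming that inventory is available as a plain dictionary; the library and file names are hypothetical, and Conan does not provide this information directly.

```python
from collections import defaultdict
from typing import Dict, List, Set


def find_duplicated_protos(libraries: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return every .proto compiled into more than one library.

    libraries maps a library name to the set of .proto files whose
    generated .pb.cc sources are compiled into it.
    """
    owners = defaultdict(list)
    for library, protos in libraries.items():
        for proto in protos:
            owners[proto].append(library)
    return {
        proto: sorted(libs) for proto, libs in owners.items() if len(libs) > 1
    }


# Hypothetical inventory of what an executable would link or load.
linked = {
    "libbar": {"foo.proto", "bar.proto"},
    "libbaz": {"foo.proto"},
}
duplicates = find_duplicated_protos(linked)
print(duplicates)
```

Any entry in the result is a candidate for duplicate symbols in the final executable, so it is worth moving that .proto into a single library that the others depend on.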
As a side note, the case mentioned by @SpaceIm is slightly different, but also related. It is typical for a protobuf package to contain both the compiler (protoc) and the library (libprotobuf). Strictly speaking, the compiler is only needed at build time and not by consumers, while the library is needed by consumers (both its headers and the runtime library). If you also express the dependency on protobuf as a tool requirement, then indeed one needs to make sure that both versions are exactly the same. This would be a requisite if we want to cross-build any project that requires running protoc on the build platform but uses libprotobuf from the host platform.
@atniomn I'm assuming that in the case you are describing, you are just consuming protobuf as a regular requirement.
I would like to see an enhancement to that example which involves multiple repositories and outlines best practices on packaging and sharing .proto, .pb.h and .pb.cc files.
If you find any of the above explanations useful, we could try and demonstrate some of these cases. However, the management of Protobuf definitions at scale where they may live in different repositories still remains an open problem in the industry - even without taking C++ into account! Recently I have seen projects use buf lockfiles: https://buf.build/ - to try and at least avoid issues with incompatibilities at the message definition level. They provide a good overview of the current challenges: https://docs.buf.build/introduction.
Hi @atniomn
Some important things have changed since then, and now Conan 2 is the mainstream version.
Conan 2 now always uses the dual "build" and "host" profiles and contexts, which allows putting the protobuf package in both the build and host contexts for cross-building. The docs for Conan 2 also contain a dedicated example about it: https://docs.conan.io/2/examples/graph/tool_requires/using_protobuf.html
Also, there are some other features, like self.tool_requires("pkg/<host_version>"), that make it easier to track the same version in the host and build contexts.
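As a rough sketch of how these two pieces fit together in a Conan 2 recipe (the package name and the protobuf version are placeholders, and the import fallback exists only so the snippet can be loaded without Conan installed):

```python
try:
    from conan import ConanFile
except ImportError:  # fallback so the sketch imports without Conan installed
    ConanFile = object


class ProtoConsumerRecipe(ConanFile):
    name = "proto_consumer"  # hypothetical package name
    version = "1.0"
    settings = "os", "arch", "compiler", "build_type"

    def requirements(self):
        # libprotobuf (headers and runtime library) in the host context.
        # The version shown is only an example.
        self.requires("protobuf/3.21.12")

    def build_requirements(self):
        # protoc in the build context, tracking whatever version the
        # host requirement above resolves to.
        self.tool_requires("protobuf/<host_version>")
```

This way only the regular requirement names a version, and the tool requirement follows it, so protoc and libprotobuf should not drift apart when the version is bumped.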
I'd like to ask whether there are any further questions regarding this ticket and the explanations above from @jcar87. For any other issue or question you might have, please don't hesitate to create new tickets, and we'll try to help. Many thanks for your feedback!
My firm has begun a widespread adoption of conan. We have dozens of separate repositories, each of which produces a conan package. In the first attempt to adopt conan, some individuals decided to include generated protobuf headers in their conan packages. This will cause a compile-time error if the version of protoc used to generate those headers (e.g. protobuf/3.6.1) is older than the version being used by the current project (e.g. protobuf/3.17.1):
By default, in the semver_direct_mode, conan will not complain about dependencies using protobuf/3.W.X vs protobuf/3.Y.Z, but if you include generated headers, you will run into the above compile-time error. To me, it seems obvious that if you include generated protobuf headers in your package, you should use full_package_mode.
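A simplified model of why the default mode hides this difference is sketched below. It is not Conan's implementation: the function name is made up, and only the version part of the dependency reference is modelled. In a Conan 1.x recipe, the stricter full_version_mode is normally selected per dependency inside package_id(), with self.info.requires["protobuf"].full_version_mode().

```python
def package_id_component(dep_version: str, mode: str) -> str:
    """Return the part of a dependency version that feeds the package ID.

    Simplified model for versions >= 1.0 with three numeric components.
    """
    major, _minor, _patch = dep_version.split(".")
    if mode == "semver_direct_mode":
        # Only the major version is recorded; minor and patch are wildcards.
        return f"{major}.Y.Z"
    if mode == "full_version_mode":
        # The complete version is recorded.
        return dep_version
    raise ValueError(f"mode not covered by this sketch: {mode}")


old = "3.6.1"
new = "3.17.1"

# Same component, so Conan sees no reason to ask for a new binary.
same_under_semver = (
    package_id_component(old, "semver_direct_mode")
    == package_id_component(new, "semver_direct_mode")
)

# Different component, so a binary built against 3.6.1 is reported missing.
same_under_full = (
    package_id_component(old, "full_version_mode")
    == package_id_component(new, "full_version_mode")
)
print(same_under_semver, same_under_full)
```

Under the semver-style mode both versions collapse to the same value, which is why the mismatch only surfaces later as a compile-time error.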
An alternative approach would be to package only the .proto files and have consumer packages import them in their imports() methods, and then have the consumer's CMakeLists.txt generate the .pb.cc and .pb.h files.
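Whatever build system drives it, the generation step on the consumer side comes down to a protoc invocation over the imported .proto files. A minimal sketch of that command line follows; the directory and file names are hypothetical, and the helper only builds the argument list without running anything.

```python
from typing import List, Sequence


def protoc_cpp_command(
    proto_dir: str,
    out_dir: str,
    proto_files: Sequence[str],
    protoc: str = "protoc",
) -> List[str]:
    """Build the protoc command that generates .pb.h/.pb.cc into out_dir."""
    return [
        protoc,
        f"--proto_path={proto_dir}",  # where the imported .proto files live
        f"--cpp_out={out_dir}",       # where the generated C++ is written
        *proto_files,
    ]


# Hypothetical layout: .proto files imported into ./protos by the consumer.
command = protoc_cpp_command("protos", "generated", ["protos/foo.proto"])
print(" ".join(command))
```

The list can be passed to subprocess.run(command, check=True) once protoc is on the PATH. The important part is that the protoc used here comes from the same protobuf version as the headers and library the consumer links against.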
I found the protobuf example provided in the examples repository to be very limited. It only demonstrates how to generate .pb.cc and .pb.h files for consumption by the same package which contains the .proto. I would like to see an enhancement to that example which involves multiple repositories and outlines best practices on packaging and sharing .proto, .pb.h and .pb.cc files.
Additionally, given the complexity of this use case, I would really appreciate some guidelines around best practices.