conan-io / conan

Conan - The open-source C and C++ package manager
https://conan.io
MIT License

Packaging proto best practice #11572

Open atniomn opened 2 years ago

atniomn commented 2 years ago

My firm has begun a widespread adoption of conan. We have dozens of separate repositories, each of which produces a conan package. In the first attempt to adopt conan, some individuals decided to include generated protobuf headers in their conan packages. This causes a compile-time error if the headers were generated with an older version of protoc (e.g. protobuf/3.6.1) than the version used by the current project (e.g. protobuf/3.17.1):

#error This file was generated by an older version of protoc which is
#error incompatible with your Protocol Buffer headers. Please
#error regenerate this file with a newer version of protoc.

By default, in the semver_direct_mode, conan will not complain about dependencies using protobuf/3.W.X vs protobuf/3.Y.Z, but if you include generated headers, you will run into the above compile-time error. To me, it seems obvious that if you include generated protobuf headers in your package, you should use full_package_mode.

An alternative approach would be to package only the .proto files, have consumer packages import them in their imports() methods, and then have each consumer's CMakeLists.txt generate the .pb.cc and .pb.h files.

I found the protobuf example provided in the examples repository to be very limited. It only demonstrates how to generate .pb.cc and .pb.h files for consumption by the same package that contains the .proto. I would like to see an enhancement to that example which involves multiple repositories and outlines best practices on packaging and sharing .proto, .pb.h and .pb.cc files.

Additionally, given the complexity of this use case, I would really appreciate some guidelines around best practices.

SpaceIm commented 2 years ago

This issue is also described in https://github.com/conan-io/conan/issues/11193, but I don't know if it's fully fixed in conan v2.

jcar87 commented 2 years ago

Hi @atniomn, thank you for raising this issue! Protobuf is always an interesting one :)

I believe there are multiple aspects of compatibility that you need to take into account when using Protobuf in Conan packages:

Between generated .pb.h and the headers from the protobuf library

This is the case you are reporting. If you have a Conan package that exposes .pb.h files as public headers, the downstream consumers of this package need to be using the same version of the protobuf library (.pb.h headers include headers from this). A recent version of protobuf performs a check like this:

#if PROTOBUF_VERSION < 3018000
#error This file was generated by a newer version of protoc which is
#error incompatible with your Protocol Buffer headers. Please update
#error your headers.
#endif
#if 3018001 < PROTOBUF_MIN_PROTOC_VERSION
#error This file was generated by an older version of protoc which is
#error incompatible with your Protocol Buffer headers. Please
#error regenerate this file with a newer version of protoc.
#endif
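
To make the two comparisons concrete, here is a small Python emulation of the preprocessor logic. It is only a sketch: the function names are made up for illustration, and the PROTOBUF_MIN_PROTOC_VERSION values passed in (the ".0" patch of each minor release) are an assumption, not something taken from the protobuf sources.

```python
def encode_version(major: int, minor: int, patch: int) -> int:
    """Pack a version the way the macros above do: 3.18.1 -> 3018001."""
    return major * 1_000_000 + minor * 1_000 + patch


def check_generated_header(library_version, library_min_protoc,
                           min_library_version, protoc_version):
    """Mimic the two #if checks in a generated .pb.h (hypothetical helper).

    library_version     -- PROTOBUF_VERSION of the installed library headers
    library_min_protoc  -- PROTOBUF_MIN_PROTOC_VERSION of those headers
    min_library_version -- constant baked into the .pb.h (3018000 above)
    protoc_version      -- constant baked into the .pb.h (3018001 above)

    Returns which #error would fire, or None if both checks pass.
    """
    if library_version < min_library_version:
        return "generated by a newer protoc"
    if protoc_version < library_min_protoc:
        return "generated by an older protoc"
    return None


# The header quoted above, compiled against three library versions.
# The second argument of each call is an assumed value.
same = check_generated_header(encode_version(3, 18, 1), encode_version(3, 18, 0),
                              3018000, 3018001)
older_lib = check_generated_header(encode_version(3, 17, 1), encode_version(3, 17, 0),
                                   3018000, 3018001)
newer_lib = check_generated_header(encode_version(3, 19, 0), encode_version(3, 19, 0),
                                   3018000, 3018001)
```

Under those assumptions only the matching 3.18.x library passes: the 3.17.1 headers trip the first check and the 3.19.0 headers trip the second.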

By the looks of it, the check is so narrow that in practice you will need exactly the same version. As you have mentioned, you would have to override the default semver_direct_mode and use something stricter, like full_version_mode. You can do this for the Protobuf dependency only, rather than globally; you can see examples here.
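
As a rough illustration of the per-dependency override, a Conan 1.x recipe could look something like the sketch below. The package name, version and settings are hypothetical; the only part that matters is the package_id() method.

```python
from conans import ConanFile  # Conan 1.x import path


class MessagesConan(ConanFile):
    # Hypothetical package that ships generated .pb.h files as public headers
    name = "messages"
    version = "1.0"
    settings = "os", "compiler", "build_type", "arch"
    requires = "protobuf/3.17.1"

    def package_id(self):
        # Apply the stricter mode to the protobuf requirement only;
        # every other requirement keeps the default mode.
        self.info.requires["protobuf"].full_version_mode()
```

With this in place, a different protobuf version in the dependency graph yields a different package ID for this package, so a binary built against another protobuf version is not silently reused.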

Runtime protobuf library

In a scenario where you have multiple libraries that have compiled .pb.cc files inside them, and these libraries live in different conan packages, if you have an executable that needs to load these libraries at runtime, the protobuf library loaded by the linker (or statically linked into the executable), needs to be compatible and provide the symbols expected by all those .pb.cc files. If those .pb.cc files were created by different versions of protoc against different versions of the protobuf headers, but the executable only loads one version - it may not work.

Protobuf themselves recommend recompiling all .pb.cc files against the same version of the library to avoid these issues, as they do not guarantee ABI compatibility. See here.

Note that I'm listing this as a separate case from the one above involving the headers, because it is possible to have multiple shared libraries that have .pb.cc in them, but do not expose any .pb.h headers publicly. This would effectively isolate the protobuf layer to where it's needed, and expose an abstraction on top of it. I believe this is the approach followed by the Google API Client libraries, although they no longer maintain C++ libraries.

.proto files

As you suggest, it is also possible to bundle .proto files so that consumers "build" them themselves. This would at least ensure that, for those consumers, compatibility between .pb.h and the library headers, and between the generated code and the runtime library, is guaranteed.

However, because .protos are compiled into C++, we need to take special care: depending on how they are used and how they evolve, we can cause source incompatibilities, runtime errors, or protobuf wire protocol issues. For example:

As a summary:

As a side note, the case mentioned by @SpaceIm is slightly different, but also related. It is typical for a protobuf package to contain both the compiler (protoc) and the library (libprotobuf). Strictly speaking, the compiler is only needed at build time and not by consumers, while the library is needed by consumers (both its headers and the runtime library). If you express the dependency on protobuf as a tool requirement, then indeed one needs to make sure that both versions are exactly the same. This would be a requisite if we want to cross-build any project that requires running protoc on the build platform but uses the libprotobuf from the host platform. @atniomn I'm assuming that in the case you are describing, you are just consuming protobuf as a regular requirement.

"I would like to see an enhancement to that example which involves multiple repositories and outlines best practices on packaging and sharing .proto, .pb.h and .pb.cc files."

If you find any of the above explanations useful, we could try and demonstrate some of these cases. However, the management of Protobuf definitions at scale where they may live in different repositories still remains an open problem in the industry - even without taking C++ into account! Recently I have seen projects use buf lockfiles: https://buf.build/ - to try and at least avoid issues with incompatibilities at the message definition level. They provide a good overview of the current challenges: https://docs.buf.build/introduction.

memsharded commented 1 week ago

Hi @atniomn

Some important things have changed since then, and now Conan 2 is the mainstream version.

Conan 2 now always uses the dual "build" and "host" profiles and contexts, which allows the protobuf package to be placed in both the build and the host context for cross-building. The docs for Conan 2 also contain a dedicated example about it: https://docs.conan.io/2/examples/graph/tool_requires/using_protobuf.html

There are also other features, like self.tool_requires("pkg/<host_version>"), that make it easier to track the same version in the host and build contexts.
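
A minimal sketch of how that can look in a Conan 2 recipe follows. The package name and the protobuf version are placeholders chosen for illustration, and a real recipe would also need its build logic.

```python
from conan import ConanFile


class MessagesRecipe(ConanFile):
    # Hypothetical package that compiles .proto files at build time
    name = "messages"
    version = "1.0"
    settings = "os", "compiler", "build_type", "arch"

    def requirements(self):
        # Host context: libprotobuf headers and library that the package links against
        self.requires("protobuf/3.21.12")

    def build_requirements(self):
        # Build context: protoc, pinned to whatever version the host requirement resolves to
        self.tool_requires("protobuf/<host_version>")
```

Because the tool requirement follows the host version, bumping protobuf in requirements() is the only change needed to keep protoc and libprotobuf aligned.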

If there are any further questions regarding this ticket and the explanations above from @jcar87, or any other issue or question you might have, please don't hesitate to create new tickets, and we'll try to help. Many thanks for your feedback!