pybind / pybind11_protobuf

Pybind11 bindings for Google's Protocol Buffers
Other
54 stars 33 forks source link

proto_caster's as shared library have broken symbol visibility #160

Open StefanBruens opened 2 months ago

StefanBruens commented 2 months ago

Both native and wrapped proto_casters use several functions from proto_cast_util.cc. Unfortunately, most of these functions are not actually usable, as the pybind11 default hidden visibility is propagated to these functions.

This can be verified by analyzing the generated object file:

> readelf -a -CW ./build/CMakeFiles/pybind11_native_proto_caster.dir/pybind11_protobuf/proto_cast_util.cc.o  | grep -E 'FUNC .*GLOBAL'   592: 0000000000001630    41 FUNC    GLOBAL HIDDEN   160 pybind11_protobuf::PyBytesAsStringView(pybind11::bytes)
   593: 0000000000001660     3 FUNC    GLOBAL HIDDEN   160 pybind11_protobuf::PyProtoGetCppMessagePointer(pybind11::handle)
   594: 0000000000001670   172 FUNC    GLOBAL HIDDEN   160 pybind11_protobuf::PyProtoDescriptorFullName[abi:cxx11](pybind11::handle)
   595: 0000000000001720   158 FUNC    GLOBAL HIDDEN   160 pybind11_protobuf::PyProtoHasMatchingFullName(pybind11::handle, google::protobuf::Descriptor const*)
   748: 0000000000001ed0  1090 FUNC    GLOBAL HIDDEN   160 pybind11_protobuf::PyProtoSerializePartialToString(pybind11::handle, bool)
   829: 00000000000029e0   457 FUNC    GLOBAL HIDDEN   160 pybind11_protobuf::CProtoCopyToPyProto(google::protobuf::Message*, pybind11::handle)
   839: 0000000000003790  1089 FUNC    GLOBAL HIDDEN   160 pybind11_protobuf::AllocateCProtoFromPythonSymbolDatabase(pybind11::handle, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
   845: 0000000000003be0   104 FUNC    GLOBAL DEFAULT  160 pybind11_protobuf::InitializePybindProtoCastUtil()
   846: 0000000000003c50   274 FUNC    GLOBAL DEFAULT  160 pybind11_protobuf::ImportProtoDescriptorModule(google::protobuf::Descriptor const*)
   847: 00000000000041b0   201 FUNC    GLOBAL HIDDEN   160 pybind11_protobuf::GenericPyProtoCast(google::protobuf::Message*, pybind11::return_value_policy, pybind11::handle, bool)
   848: 0000000000004280     8 FUNC    GLOBAL HIDDEN   160 pybind11_protobuf::GenericProtoCast(google::protobuf::Message*, pybind11::return_value_policy, pybind11::handle, bool)

All method using a parameter from the pybind11:: namespace becomes "hidden", and when linking the object file into the shared library the HIDDEN symbols are omitted from the symbol table.

See https://pybind11.readthedocs.io/en/stable/faq.html#someclass-declared-with-greater-visibility-than-the-type-of-its-field-someclass-member-wattributes

google-or-tools e.g. patches the pybind11_protobuf build to create a static library (which is not affected by symbol visibility) to work around this problem:
https://github.com/google/or-tools/blob/2384738a951561bf905f3435651310841c309b4c/patches/pybind11_protobuf.patch#L32

rwgk commented 2 months ago

Tagging @srmainwaring who contributed the cmake support.

Could/should we adopt the or-tools patch here?

StefanBruens commented 2 months ago

It all depends on the architecture you want to have.

Most of the code could just be moved to header-only. Having the implementation in a static library does not provide any benefit, but has several drawbacks:

The big outlier is the GlobalState singleton. That does not fit "header-only". The question is, is this singleton actually required to be a singleton, or is this just an optimization to cache the protocol definitions. I am tempted to vote for the latter to be the case here - otherwise several python modules using pybind11_protobuf (possibly different versions) would not actually work.

Note, having the implementation in a static library get rids of the the symbol visibility problem, but does not solve the ABI problem the limited visibility is meant to solve. The static library may be compiled with one pybind11 version, but the module using it may use a different pybind11 version.

rwgk commented 2 months ago

Most of the code could just be moved to header-only.

I think that's very unlikely to happen. (Just wanted to let you know; very long background omitted.)

StefanBruens commented 1 month ago

Most of the code could just be moved to header-only.

I think that's very unlikely to happen. (Just wanted to let you know; very long background omitted.)

Care to elaborate? As far as I can see, the static library only has downsides.

rwgk commented 1 month ago

Nothing nice, I'm afraid, tbh: currently unclear ownership and governance, "best effort" only (fancy for: no actual staffing). — Hopefully some things will improve in the next few months.

StefanBruens commented 1 month ago

Nothing nice, I'm afraid, tbh: currently unclear ownership and governance, "best effort" only (fancy for: no actual staffing). — Hopefully some things will improve in the next few months.

Ah, so not a policy decision, but "just" lack of time ...