GobySoft / dccl

Dynamic Compact Control Language

Query: Recommended method for deploying a dccl::Codec in .so format? #103

Closed psmskelton closed 1 year ago

psmskelton commented 1 year ago

What is the recommended method for deploying a dccl::Codec in a .so format?

For example, let's assume we have an extensive shared library (.so) containing N messages defined across M .proto files. This .so, along with the requisite *.pb.h files, lets us deploy the messages without problems. However, managing codec.load() calls for all of these messages across multiple projects is growing tiresome, so we'd like to also provide a pre-loaded dccl::Codec with this library.

We have a deployed (hack) solution for Python that uses some Python shenanigans to work around the inability to iterate over the google::protobuf::DescriptorPool, so that all messages can be loaded programmatically. (NB: Google's design decision not to facilitate iteration stems from their use of an internal-to-Google .proto file that has every message defined across all of their internal projects, so you can imagine how large that is.) The solution gives the user a convenient <library>.get_codec() method that returns a codec with the entire set of messages already loaded. While we could implement something similar in C++, I believe that would go against the "intended" way of doing things.

I believe you have already considered this as the dccl::Codec definition states:

    /// \brief Add codecs and/or load messages present in the given shared library handle
    ///
    /// Codecs and messages must be loaded within the shared library using a C function
    /// (declared extern "C") called "dccl3_load" with the signature
    /// void dccl3_load(dccl::Codec* codec)
    void load_library(void* dl_handle);

I assume that, for runtime-configurable/dynamic .so ingestion, this can be passed a dl_handle as follows:

dccl::Codec codec_;
dccl::DynamicProtobufManager::enable_compilation();
codec_.load_library(dccl::DynamicProtobufManager::load_from_shared_lib("/usr/local/lib/libsomeprotolibrary.so"));

However, a documented process for achieving it appears to be missing in the docs.

tsaubergine commented 1 year ago

Codec::load_library() has two overloads that do the same thing, but the void* overload assumes you've already called dlopen somewhere, so the std::string overload is probably the easiest to use (see https://libdccl.org/4.0/classdccl_1_1Codec.html).

After opening the .so, Codec will call the dccl3_load C function (analogous to a constructor), which must be present in that library:

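extern "C"
{
    // Called by Codec::load_library() once the library has been opened.
    void dccl3_load(dccl::Codec* dccl)
    {
    }
    // Called when the library is unloaded from the Codec.
    void dccl3_unload(dccl::Codec* dccl)
    {
    }
}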

The dccl3_unload function (analogous to a destructor) must be present as well so that the library can be unloaded. In both functions, the Codec* pointer passed in points to the Codec that called load_library.

Within these you can call any Codec methods that you want, so for your use you can call load on your Protobuf messages, such as:

    void dccl3_load(dccl::Codec* dccl)
    {
        dccl->load<protobuf::MyDCCLMessage1>();
        dccl->load<protobuf::MyDCCLMessage2>();
    }
    void dccl3_unload(dccl::Codec* dccl)
    { 
        dccl->unload<protobuf::MyDCCLMessage1>();
        dccl->unload<protobuf::MyDCCLMessage2>();
    }

One useful thing would be that I could update the protoc-gen-dccl plugin to optionally output these load/unload snippets automatically at compile time.

Given that, you can load the .so with the following (simpler than you had above):

dccl::Codec codec_;
codec_.load_library("/usr/local/lib/libsomeprotolibrary.so");
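
The idea mentioned above of emitting the load/unload snippets automatically can also be sketched outside of protoc-gen-dccl, as a small build-time generator. This is only an illustration, not how the plugin works: the helper names to_cpp_type and generate_hooks are hypothetical, and the name conversion assumes top-level messages only (protoc names nested messages Outer_Inner in C++, which this sketch does not handle).

```cpp
#include <string>
#include <vector>

// Hypothetical helper: convert a protobuf full name ("pkg.sub.Msg") into the
// C++ type generated for a top-level message ("pkg::sub::Msg").
// Assumption: no nested messages.
std::string to_cpp_type(const std::string& full_name)
{
    std::string out;
    for (char c : full_name)
    {
        if (c == '.')
            out += "::";
        else
            out += c;
    }
    return out;
}

// Hypothetical helper: emit the source text of the dccl3_load / dccl3_unload
// hooks, with one load<T>() and one unload<T>() call per message.
std::string generate_hooks(const std::vector<std::string>& full_names)
{
    std::string load_body;
    std::string unload_body;
    for (const auto& name : full_names)
    {
        const std::string type = to_cpp_type(name);
        load_body += "        dccl->load<" + type + ">();\n";
        unload_body += "        dccl->unload<" + type + ">();\n";
    }

    std::string out = "extern \"C\"\n{\n";
    out += "    void dccl3_load(dccl::Codec* dccl)\n    {\n";
    out += load_body;
    out += "    }\n";
    out += "    void dccl3_unload(dccl::Codec* dccl)\n    {\n";
    out += unload_body;
    out += "    }\n}\n";
    return out;
}
```

The generated text would still need the DCCL header and the relevant .pb.h includes placed ahead of it before being compiled into the .so.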

psmskelton commented 1 year ago

Ahhh, of course, I see where I was mistaken on the process. Thanks for the clarification!

Having either the descriptors or the load/unload snippets automatically generated could be helpful for teams that have multiple developers using the same core message library. What I've implemented in the meantime parses the .pb.h files that accompany the .so, looking for:

class <message_class_name> : public ::google::protobuf::Message /* @@protoc_insertion_point(class_definition:<message_full_name>) */ {

It extracts the <message_full_name> field and then loads that message using:

_codec.load(dccl::DynamicProtobufManager::find_descriptor(<message_full_name>));

As we control which versions of the protobuf (etc.) libraries are used, the glass-house fragility of this method is acceptable at the moment. The parsing is fast, as these files are <100k lines, but we wouldn't really care even if it took a second or two, since we load the .so on instantiation, not when we need it 🤷
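
The header-scanning step described above could look roughly like the following. This is a minimal sketch, not the code actually deployed: the function name extract_message_full_names is hypothetical, and it assumes the generated header carries the class_definition insertion-point markers in exactly the form quoted above, which depends on the protoc version.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect every <message_full_name> that appears in a
// "@@protoc_insertion_point(class_definition:<message_full_name>)" marker.
// Assumption: the marker format matches the one quoted in this thread.
std::vector<std::string> extract_message_full_names(const std::string& header_text)
{
    static const std::regex marker(
        R"(@@protoc_insertion_point\(class_definition:([A-Za-z_][A-Za-z0-9_.]*)\))");

    std::vector<std::string> names;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(header_text.begin(), header_text.end(), marker);
         it != end; ++it)
    {
        // Capture group 1 holds the dotted protobuf full name.
        names.push_back((*it)[1].str());
    }
    return names;
}
```

Each returned name could then be passed to dccl::DynamicProtobufManager::find_descriptor() and on to load(), as in the line above. Other insertion points (such as namespace_scope) are ignored because the pattern only matches class_definition markers.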

tsaubergine commented 1 year ago

Take a look at https://github.com/GobySoft/dccl/pull/106 where I added functionality to protoc-gen-dccl to make it easier to do what you want without relying on fragile preprocessing steps.

I know I owe everyone some documentation on the more advanced features.