gnuradio / pmt

pmt
GNU Lesser General Public License v3.0
11 stars 11 forks source link

Struct reflection #81

Closed jsallay closed 1 year ago

jsallay commented 1 year ago

Basic test of using refl-cpp to convert structs to pmts. There are still a lot of questions.

I would love feedback from @mormj and @RalphSteinhagen. I looked at the use cases here: https://quick-bench.com/q/BogFXD0XKHqR5RJPScaYY42HTW4 . It looks like that was done with tuples and arrays. I'm not sure how I would work that with pmts since the variant has to be known at compile time. I put everything in a pmt map which is definitely slower than tuples, but is generic. The code is in no way complete, but it more of me experimenting with the idea.

I wrote a few faster versions that use vectors instead of maps. I attempted to create constexpr versions but discovered that the current c++ spec doesn't allow for returning a dynamically allocated object (ie vector or string) from a constexpr function. So I don't think I can do much better with the structure that I have. (Though I am totally open to better ways of doing it.)

jsallay commented 1 year ago

I talked with @RalphSteinhagen and he showed me some ideas for a more type safe version of std::any https://godbolt.org/z/35T8E9axv and https://github.com/kelbon/AnyAny . I would predict that these would be probably a few orders of magnitude faster than my implementation which is great.

My primary concern is that the full type information isn't carried with the data. This means that any block that wants to interpret the data has to know the structure definition. I am concerned about how this works with (de)serialization. A common pattern in GNURadio is to have a data block that produces the pmt and then we have a network block that serializes it. The easiest way to handle this would be to make serialize a template function and pass in the struct type. In the case of a c++ dynamically generated flowgraph we could probably work something out.

In the case of a python flowgraph, I don't see a good solution. A user would like to be able to create and run a python flowgraph without having compilers installed. We couldn't precompile the full set of serializers in this case.

As I've been thinking about it, I came up with the idea of splitting the structure definition from the actual structure data. We could create a static pmt for the structure definition that contains field names, types and any other needed information (like offsets unless we convert to a tuple). When we convert a struct to a pmt, we would include the static structure pmt and the data pmt as an any. We could probably work something out so that the structure pmt wasn't passed with the data every time.

A serializer block would be able to use the structure definition to know how to handle each field in the struct (presuming that we are limiting ourselves to PODs + vectors and strings). In fact any block would be able to interpret the data.

I have not put a lot of effort into this idea so there could very well be holes, but I really like the idea of a pmt being interpretable by anything that receives it.

RalphSteinhagen commented 1 year ago

@jsallay sorry for the late reply ...

In my mental model, there are two aspects:

  1. exchanging/translation of settings and generic data between nodes and from nodes to, for example, the UI using pmtvs
    (<-> largely async messages and stream-tags)
  2. supporting user-defined types that can be exchanged between nodes (<-> high-freq sampled/synchronous streaming data)

using pmtv for node settings, messages, tags ...

Here the idea is to use std::map<std::string, pmt::pmtv> to limit the heterogeneity/number of public interfaces and types that the blocks implement and that need to be known by the UI, flow-graph/settings management, or for exchanging data between nodes. As a code snippet example/idea/proposal (here with a pre-version of Derek's idea of having port names available using IDE auto-completion):

struct node_settings { // node domain object specialised/derived/modified (as needed)
    float samp_rate = 44100.f;
    int scaling_factor = 1;
};
ENABLE_REFLECTION_FOR(node_settings, samp_rate, scaling_factor) // <- refl::cpp macro

template<typename T, std::size_t N_MIN = 0, std::size_t N_MAX = N_MAX, port_domain_t domain = CPU>
class my_node : public gr::node<my_node<T, N_MIN, N_MAX, domain>>, Setting<node_settings> {
public:
    gr::IN<T, N_MIN, N_MAX> in0;
    gr::IN<T, N_MIN, N_MAX> in1;
    gr::OUT<T, N_MIN, N_MAX> out;

    template<fair::meta::t_or_simd<T> V>
    [[nodiscard]] constexpr V
    process_one(V value1, V value2) const noexcept {
        const auto scale = settings.scaling_factor; // <- type-safe compile-time retrieving of settings
        return  scale * some_math_func(value1, value2); // ... do some operation with values from ports in[0,1]
    }
};
ENABLE_REFLECTION_FOR((my_node, float, double, int), in0, in1, out) // <- refl::cpp macro for my_node

and gr::node<...> implementing the getter/setter that are either invoked by the UI and/or when a tag arrives via one the input ports and if the map contains keys that match one of the node's settings field names and its type the corresponding type stored in the pmtv:

void set_setting(std::map<std::string, pmt::pmtv> new_setting);   // map might contain only subset of keys/fields
std::map<std::string, pmt::pmtv> get_setting(std::string_view key_name...); // return subset (/all)  settings (if empty)

The setter will need to implement a two-stage commit in order to be thread-safe:

So, what's needed would be a (de-)serialiser that convert pmtv-maps from/to structs. In terms of signature, I think an in-place modifying function signature might be better suited (avoids struct allocation) than returning a new struct. Ie. the signature could look something like:

typename<ReflectableType T>
<meta info> [de]serialise(T& domain_object, std::map<std::string, pmt::pmtv> map);

The same type of signature could be also used to (de-)serialise into/from an arbitrary wire-format ala:

typename<WireFormat F>
<meta info> [de]serialise(ByteBufferType& io_buffer, std::map<std::string, pmt::pmtv> map);

For transmitting pmtv between node's on the same CPU, I think using a std::map<std::string, pmt::pmtv> would probably the easiest/fastest, but when crossing domain boundaries (e.g. CPU->GPU, CPU->NET) the byte buffer might be best. There is a plethora of possible binary/text wire-formats to choose from (YaS, protobuf, flexbuffer, fastbuffer, JSON, YML, ...).

supporting user-defined types ...

Presently, GR does not support nor allow for user-defined types that can be transmitted between nodes. This was the other proposal to consider to extend the existing default_supported_types to:

using default_supported_types = std::tuple<bool,
                                           uint8_t, uint16_t, uint32_t, uint64_t,
                                           int8_t, int16_t, int32_t, int64_t,
                                           float, double, std::complex<float>, std::complex<double>,
                                           DataSet<float>, DataSet<double>, Any>; // <- new data types

Here the Any does not replace the use of pmt::pmtv. The use of 'variant' is IMO faster because it constrains the visitor pattern to a countable number of types that need to be visited and handled by all participating blocks. Notably, the visitor pattern/std::visit would warn you about a type that isn't handled and would work with any core or OOT defined block (<-> contract).

On the other side, the Any is here more as an extension point for OOT modules that may also move any object between OOT nodes that "know" the specific user-type during compile time. Other nodes -- notably those defined in the core -- may pass the object stored as Any w/o inspecting but wouldn't be able to do any useful work/algorithms with it.

This is where the std::map<std::string, pmt::pmtv> is different because every node can make use of parts of the information contained therein as long as it knows the 'key' and can convert the pmtv to its own internal convertable type.

Maybe one should add the std::map<std::string, pmt::pmtv> to the default_supported_types ... :thinking:

Hope this makes sense...