Auto-generated ext_h101 structs

klenze commented 1 year ago

Actually you cannot, because the spoilsports at ISO allow neither defining a nameless class nor using a class definition in place of a class name, be it in a template parameter (pack) or in an inheritance specification.

I was wrong. (I am unsure if my C++ is good enough to make this a case of me fulfilling Clarke's first law.)

While you cannot directly pass anonymous classes as template parameters, you can certainly pass lambda expressions returning arbitrary types as template parameters, which is sufficient.

See proof of concept here. (Trigger warnings: C++, preprocessor macros, virtual inheritance, variadic templates).

The gist is that you can run

  All<MKCLASS(Foo), MKCLASS(Bar), MKCLASS(Baz)> a;
  a.onInit();

where a will have members fFoo, fBar and fBaz and onInit will call lambdas which have access to their variables and the names of these variables as a string.

For zero suppressed 1d arrays, we could have a macro which takes a name Foo and defines Foo, FooI and Foov appropriately, as well as calling the correct EXT_STR_ITEM_INFO_... for them.

Arguments could be easily added to the onInit and the lambda expressions. The handling of n-dimensional arrays is left as an exercise for the reader. So is passing stuff to constructors, if one needed that for some reason.

I think in the spirit of "Don't Repeat Yourself" (DRY), it is better to have this in one place instead of having longish macros saying map the variable Foo to "Foo" for 100s of variables.

YanzhaoW commented 1 year ago

I don't quite get it. What is its advantage compared to directly defining the struct?

struct MyStruct{
    uint16_t fFoo;
    uint16_t fBar;
    uint16_t fBaz;
};
auto a = Data<MyStruct>{};

klenze commented 1 year ago

Because that way, you end up with stuff like this, e.g. each variable needs three lines of macros like:

  EXT_STR_ITEM_INFO_LIM(ok,si,offset,struct_t,printerr,\
                     CALIFA_TRGENE,                   UINT32,\
                    "CALIFA_TRGENE",2); \

By contrast, my way needs only a different macro invocation instead of macro definitions to be able to do the equivalent of the above.

Naturally, my code is still a toy example. In reality, you would have to call ext_data_struct_info_item (or one of the wrapper macros) in fOnInit. I guess it would be simplest to have fOnInit::value_type take a (basically void) pointer to the beginning of the actual struct and just calculate the char* difference to &(this->VARNAME).
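The offset calculation described above could look roughly like the following sketch. ItemInfo, DescribeItem, DESCRIBE_ITEM and ToyH101 are hypothetical names made up for illustration; the real code would pass the name, offset and size on to ext_data_struct_info_item (or one of its wrapper macros) instead of returning them.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical record of what one item mapping needs: name, offset, size.
struct ItemInfo
{
    std::string name;
    std::size_t offset;
    std::size_t size;
};

// Compute the offset of `member` as the byte (char*) difference between its
// address and the start of the enclosing struct.
template <typename Member>
ItemInfo DescribeItem(const void* structStart, const Member& member, std::string name)
{
    const auto* base = static_cast<const unsigned char*>(structStart);
    const auto* addr = reinterpret_cast<const unsigned char*>(&member);
    return ItemInfo{ std::move(name), static_cast<std::size_t>(addr - base), sizeof(Member) };
}

// The variable name is written once and reused as both identifier and string.
#define DESCRIBE_ITEM(obj, VARNAME) DescribeItem(&(obj), (obj).VARNAME, #VARNAME)

// Toy stand-in for a generated h101 struct (member names are illustrative).
struct ToyH101
{
    std::uint32_t TRIGGER;
    std::uint32_t CALIFA_TRGENE;
};
```

With this, DESCRIBE_ITEM(data, CALIFA_TRGENE) yields the string "CALIFA_TRGENE" together with the same offset that offsetof(ToyH101, CALIFA_TRGENE) would report.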

When R3BUcesbSource is rewritten, there is nothing to stop us from using template<typename ReaderT> AddReader(std::string prefix), which could

check that ReaderT::h101_t still fits in the buffer allocated for all the h101 objects
use placement new to construct the h101: new (buffer+current_offset) ReaderT::h101_t(buffer, prefix), passing the struct both the string prefix (used to decide which names to map) and the pointer to the start of the buffer (used to calculate offsets)
instantiate ReaderT and give it a pointer to the just-created h101
increment current_offset by sizeof(ReaderT::h101_t)
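The four steps above can be sketched as follows. This is a toy under stated assumptions, not the R3BUcesbSource interface: ToySource, ToyH101 and ToyReader are hypothetical, the buffer size is arbitrary, and the alignment rounding is an addition that the steps above do not mention.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <stdexcept>
#include <string>
#include <type_traits>

// Toy h101 struct: remembers where its member sits relative to the buffer start.
struct ToyH101
{
    std::uint32_t fValue = 0;
    std::ptrdiff_t fValueOffset = 0;

    ToyH101(const unsigned char* bufferStart, const std::string& prefix)
        : fValueOffset(reinterpret_cast<const unsigned char*>(&fValue) - bufferStart)
    {
        (void)prefix; // a real struct would use the prefix to pick the names to map
    }
};

// Toy reader: only keeps the pointer to its h101 struct.
struct ToyReader
{
    using h101_t = ToyH101;
    h101_t* fData;
    explicit ToyReader(h101_t* data) : fData(data) {}
};

class ToySource
{
  public:
    template <typename ReaderT>
    ReaderT AddReader(const std::string& prefix)
    {
        using h101_t = typename ReaderT::h101_t;
        static_assert(std::is_trivially_destructible<h101_t>::value,
                      "the buffer never runs destructors");
        // Assumption: round the offset up so the new object is properly aligned.
        const std::size_t align = alignof(h101_t);
        const std::size_t start = (fOffset + align - 1) / align * align;
        // Step 1: check that the struct still fits in the buffer.
        if (start + sizeof(h101_t) > sizeof(fBuffer))
            throw std::length_error("h101 buffer exhausted");
        // Step 2: construct the struct in place, passing buffer start and prefix.
        auto* h101 = new (fBuffer + start) h101_t(fBuffer, prefix);
        // Step 4: advance the offset past the new struct.
        fOffset = start + sizeof(h101_t);
        // Step 3: create the reader and hand it the pointer.
        return ReaderT(h101);
    }

    std::size_t Used() const { return fOffset; }

  private:
    alignas(std::max_align_t) unsigned char fBuffer[256];
    std::size_t fOffset = 0;
};
```

Calling source.AddReader<ToyReader>("FOO") twice places two ToyH101 objects back to back, and each one records its own offset into the shared buffer.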

While it would be tempting to also autogenerate the FooReader code altogether, it is probably not feasible for readers due to resource overhead.

(Of course, there is the way to define every class as a template <typename Base> class Foo: public Base {...}; and then create the actual class by some template<template<typename> typename... classes> struct AllHelper, which just cascades all of the argument templates (and uses a boring empty class as the base case). This would be completely inlineable without any lambda expressions having to be evaluated at runtime. On the other hand, the embed-in-a-lambda-type trick would then involve type variadic lambda expressions which retrieve the type of their argument and use this as the base type for making their stuff inherit from, but that might lead to code which is kind of hard to understand, especially when having to add the processing steps we actually do in the readers (like going from 4x uint16_t -> uint64_t for wrts)).
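For illustration, the template cascade mentioned in the parenthesis might be sketched like this. EmptyBase, the toy Foo/Bar/Baz classes and the names argument of onInit are assumptions made for the example; only the general shape (template <typename Base> classes chained by an AllHelper) comes from the comment above.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Boring empty class used as the base case of the cascade.
struct EmptyBase
{
    void onInit(std::vector<std::string>&) {}
};

// Each item is a template that inherits from whatever base it is given.
template <typename Base>
struct Foo : Base
{
    std::uint16_t fFoo = 0;
    void onInit(std::vector<std::string>& names)
    {
        names.push_back("Foo");
        Base::onInit(names); // pass the call down the chain
    }
};

template <typename Base>
struct Bar : Base
{
    std::uint16_t fBar = 0;
    void onInit(std::vector<std::string>& names)
    {
        names.push_back("Bar");
        Base::onInit(names);
    }
};

template <typename Base>
struct Baz : Base
{
    std::uint16_t fBaz = 0;
    void onInit(std::vector<std::string>& names)
    {
        names.push_back("Baz");
        Base::onInit(names);
    }
};

// AllHelper<A, B, C>::type is A<B<C<EmptyBase>>>.
template <template <typename> class... Classes>
struct AllHelper;

template <>
struct AllHelper<>
{
    using type = EmptyBase;
};

template <template <typename> class First, template <typename> class... Rest>
struct AllHelper<First, Rest...>
{
    using type = First<typename AllHelper<Rest...>::type>;
};

template <template <typename> class... Classes>
using All = typename AllHelper<Classes...>::type;
```

An object of type All<Foo, Bar, Baz> then has the members fFoo, fBar and fBaz, and a single onInit call visits every level of the chain. The three item classes would still be generated by a macro in practice; they are written out here only to keep the sketch readable.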

YanzhaoW commented 1 year ago

Alright, maybe we could first look at what the "industry standards" are for serialising and deserialising C++ data structures.

As far as I know, the most popular one is Google protobuf. It is very similar to our home-made ucesb in that it requires users to define the data structure in a separate file, from which it then generates a header file containing the corresponding C++ structure.

"cereal" is another type of serialisation tool, which instead requires the user to define a serialisation method inside the data structure, such as:

struct MyRecord
{
  uint8_t x, y;
  float z;

  template <class Archive>
  void serialize( Archive & ar )
  {
    ar( x, y, z );
  }
};

And then you call a global function to serialise/deserialise the data structure.

The third method (the best one in my opinion) can be found in libraries like "boost::serialization" or "zpp_bits", which require nothing from the user except the plain definition of the C++ structure:

struct MyStruct{
    uint16_t fFoo;
    uint16_t fBar;
    uint16_t fBaz;
} myStruct;

auto data = std::vector<std::byte>{};
Serialize(data, myStruct);
Deserialize(data, myStruct);

And what blows my mind is that the third one is very simple to implement, without any heavy use of C macros. The trick for imitating reflection on a data structure is to use structured bindings. But we need to enumerate every possible member count explicitly:

if constexpr (size == 2) {
    auto&& [m1, m2] = myStruct;       // bind both members
} else if constexpr (size == 3) {
    auto&& [m1, m2, m3] = myStruct;   // bind all three members
}
// ... one branch per supported member count

And this can get quite long.
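As a rough C++17 sketch of that idea, assuming an aggregate with at most three trivially copyable members: the member count is found by probing how many initialisers the aggregate accepts, and one if-constexpr branch per count binds the members. AnyType, CanInit, FieldCount and ForEachField are hypothetical helper names, and Serialize/Deserialize here are toy functions, not the API of boost::serialization or zpp_bits.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <utility>
#include <vector>

// Converts to anything; only ever used inside decltype, so never defined.
struct AnyType
{
    template <typename T>
    operator T() const;
};

// True if T can be brace-initialised from the given argument types.
template <typename T, typename... Args>
constexpr auto CanInit(int) -> decltype(T{ std::declval<Args>()... }, true) { return true; }
template <typename T, typename... Args>
constexpr bool CanInit(...) { return false; }

// Keep adding initialisers until the aggregate rejects one more.
template <typename T, typename... Args>
constexpr std::size_t FieldCount()
{
    if constexpr (CanInit<T, Args..., AnyType>(0))
        return FieldCount<T, Args..., AnyType>();
    else
        return sizeof...(Args);
}

// The enumerated structured bindings: one branch per member count.
template <typename T, typename F>
void ForEachField(T& obj, F&& visit)
{
    constexpr std::size_t n = FieldCount<std::remove_cv_t<T>>();
    static_assert(n >= 1 && n <= 3, "extend the if-constexpr chain for more members");
    if constexpr (n == 1) {
        auto& [m1] = obj;
        visit(m1);
    } else if constexpr (n == 2) {
        auto& [m1, m2] = obj;
        visit(m1); visit(m2);
    } else {
        auto& [m1, m2, m3] = obj;
        visit(m1); visit(m2); visit(m3);
    }
}

// Append the raw bytes of every member to `out`.
template <typename T>
void Serialize(std::vector<std::byte>& out, const T& obj)
{
    ForEachField(obj, [&out](const auto& field) {
        const auto* p = reinterpret_cast<const std::byte*>(&field);
        out.insert(out.end(), p, p + sizeof(field));
    });
}

// Read the members back in declaration order.
template <typename T>
void Deserialize(const std::vector<std::byte>& in, T& obj)
{
    std::size_t pos = 0;
    ForEachField(obj, [&in, &pos](auto& field) {
        if (pos + sizeof(field) > in.size())
            throw std::out_of_range("not enough bytes");
        std::memcpy(&field, in.data() + pos, sizeof(field));
        pos += sizeof(field);
    });
}

struct MyStruct
{
    std::uint16_t fFoo;
    std::uint16_t fBar;
    std::uint16_t fBaz;
};
```

Real libraries cover many more member counts (typically via generated code) and handle nested aggregates, arrays and endianness, all of which this sketch ignores.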

So the best solution is to wait for C++26, where a structured binding can introduce a pack, so that we can do something like:

auto&& [...m] = myStruct;

R3BRootGroup / R3BRoot

Auto-generated ext_h101 structs #865