veselink1 / refl-cpp

Static reflection for C++17 (compile-time enumeration, attributes, proxies, overloads, template functions, metaprogramming).
https://veselink1.github.io/refl-cpp/md__introduction.html
MIT License
1.05k stars 76 forks source link

Q: smarter use of find_one or for_each for non-constexpr use-cases? #50

Closed RalphSteinhagen closed 3 years ago

RalphSteinhagen commented 3 years ago

Hi,

I found some more time to delve into the details. Overall, this refl-cpp has a very smart design albeit it takes time to get used to the various (necessary) templates and paradigms. Actually, I am quite happy that my code became significantly more compact and type-safe(r) compared to previous incarnations.

I have a follow-up usability question: is there a smarter way to find and set (nested) field members based on run-time filter criteria and data while retaining the constexpr'-ness of the field'stype` information?

For starters: I am aware of the binding example as well as earlier questions (issues) here and here that may or may not be related to this.

I wrote a MVP example to illustrate my specific problem/question of 'dynamically setting field variables based on external/run-time filter criteria and data' (see also compiler-explorer). N.B. here std::string fieldName is being used as a stand-in for the filter criteria but the real-world application also contains matching the field's member(value) type information etc:

#include <https://raw.githubusercontent.com/veselink1/refl-cpp/master/include/refl.hpp>
#include <iostream>
#include <fmt/format.h>

struct Point {
    int e0 = 1;
    int e1 = 2;
    int e2 = 3;
};
REFL_AUTO(type(Point),field(e0),field(e1),field(e2))

constexpr auto type = refl::reflect<Point>();

template<typename T>
int setField(const T& value, const std::string_view fieldName /* unused */) {
    constexpr auto field = refl::util::find_one(refl::reflect<T>().members, [&fieldName](auto m) { return m.name == "e0"; }); // N.B static field name
    // field(value) = ... ; // write data to member field (run-time)
    return field(value);
}

template<typename T>
int setField2(const T& value, const std::string_view fieldName) {
    auto field = refl::util::find_one(refl::reflect<T>().members, [&fieldName](auto m) { return m.name.str() == fieldName; }); // does not work/not constexpr
    //field(value) = ; // write data to member field (run-time)
    return field(value);
}

template<typename T>
int setField3(const T& value, const std::string_view fieldName) {
    int retValue = -1;
    refl::util::for_each(refl::reflect<T>().members, [&fieldName, &value, &retValue](auto field) {
            if (field.name.str() == fieldName) { // comparison OK but looping over each field              
                //field(value) = ; // write data to member field (run-time)
                retValue = field(value);
            }
    });

    return retValue;
}

int main(int argc, const char** argv) {
    Point a;
    const std::string fieldName = fmt::format("e{}", argc); // stand-in for dynamic data

    std::cout << "looking for " << fieldName << '\n';
    std::cout << setField(a, fieldName) << '\n'; // OK -- works but only for static field mappings
    // std::cout << setField2(a, fieldName) << '\n'; // does non work because of non-constexpr filter criteria
    std::cout << setField3(a, fieldName) << '\n'; // OK -- but loops through all member fields (ie. no early return)
}

The setField(..) function works well for field names (literals) that are known at compile-time but for our application, the fieldName is typically only known after having read some (dynamic) buffer.

The setField2(...) syntax using find_one(...) variant doesn't work for obvious reasons because the filter (presently) requires a constexpr matching condition which isn't possible in our case (ie. fieldNameand type are both dynamic run-time information).

The solution in the binding example seems to tackle this problem the other way round: first collect all dynamic information (from an xml-file) in a dynamic structure and then looping through all class fields and matching those with the correct filter parameter. In our case, we thought of retaining as much as possible of the constexpr'-ness of the class field information because -- preserving thetype` information is very hard (and you solved this very well) -- and the idea was that looping through a constexpr map/info tree and filling the dynamic information immediately is much faster (i.e. doing the memory copy only once to its final destination) rather than allocating/storing all data in a temporary data structure and then to match the info to the constexpr field structure (i.e. doing the memory copy twice).

The setField3(...) function seems to be a viable workaround (solution?) under these conditions but the setField3(..) still feels a bit clumsy. Is there possibly a better/smarter option than this? constexpr field/method iterators? Is looping through all member fields for classes with ~20-30 (possibly nested) expensive?

Also -- possibly related to for_each and run-time conditions: are the static_asserts for example here and here really necessary and/or possibly too restrictive? What are they protecting against? N.B. I had some code which -- by by-passing these -- still seems to work fine. The issue seems to be primarily related to operator<< overloading and std::ostream within a for_each loop.

Any help, info, or suggestions would be much appreciated!

N.B. we are counting on that gcc and clang will soon also provide C++20 constexpr compile-time implementations of 'std::string' similarly to the latest MSVC -- albeit this won't change much for the above case).

veselink1 commented 3 years ago

I see why you would prefer to keep as much of the type information as possible. I believe all three implementations are slightly misleading, in that while the loop expansion is done at compile-time, the string comparison is of course not. You're mentioning your use case includes types with 20-30 fields. The for_each loop used in setField3 would get unrolled into 20-30 string comparisons, which can be suboptimal. Both in terms of compiled code and in terms of execution speed.

In fact, even if the std::string constexpr optimisation you've linked to is applied, setField3 would run in the order of O(N*K), (N = number of fields, K = length of fieldName). There is nothing in refl-cpp that can be directly used to help with that. In general, the approach taken with the library is to provide just the bare minimum.


In your case, I believe using a suitable data structure will lead to an improvement. One possible optimisation that comes to mind is to use a map, mapping from string names to the field index, followed by an index-based match. This should be O(N+K). See here for a basic example. This compiles into a sequence of CMP, JE (which could be turned into a jump table when the compiler sees fit -- this is speculation). You might be able to optimise this further if you map the member name to an index beforehand.

If you are matching based on type as well, I suggest taking a look at the implementation of runtime::invoke. It further narrows the search space by making sure the member type is right.


The static_asserts verify that the REFL_AUTO/REFL_TYPE macro has been used and that there is metadata generated for that type. Without them, the type metadata will be empty. Not sure what that problem with that code you're mentioning was, but please consider opening another issue.

RalphSteinhagen commented 3 years ago

@veselink1 thanks for your quick reply, input, and notably examples! These are -- as always -- quite usefull! I like the idea using an unorderd map between field names and field index. We basically already use a hash-function to minimise the need for string comparisons for a (possibly) similar reason.

[..] would get unrolled into 20-30 string comparisons [..]

That was my understanding as well. While not all branches would match during run-time, the compile-time analysis must assume that potentially all if conditions could be true. The loop is actually supposed to do a recursion to dive into nested structs, ie. searching for a field that is is a 'class', is 'reflectable', not an 'std::' container, and has a matching field name and then call itself to deal with the regular other member fields. I defined a constexpr C++20 'ReflectableClass' concept for the first three conditions -- since they can also be evaluated during compile time -- but the dynamic field name comparison lets this fall back to a run-time comparison also for the fields that are not covered by the concepts. The pseudo-code is something like:

  template<ReflectableClass T>
  void deserialise(Buffer& buffer, T &value, const uint8_t  hierarchyDepth = 0) {
    while (buffer.position() < buffer.size()) {
       // read wire-data from buffer, returns typeID (uint8_t), fieldName hash (uint8_t) + string (string or string_view), and either 
       //  * 'value data' or 
       //  * marker info that a block with nested starts (followed by further fields and marked with an 'end' marker)
       for_each(refl::reflect<T>().members, [&fieldName, &buffer, &value](auto member) {
            if (member.name.str() == fieldName) {
                using MemberType   = std::remove_reference_t<decltype(member(value)>;
                if (!isReflectableClass<MemberType>()) { // normal primitive types: uint8_t, ... float, double, std::string, ...
                   member(value) = buffer.get<MemberType>();
                } else { // nested class
                   deserialise(buffer, member(value), hierarchyDepth+1); // <- fails because during compile-time 
                  // since it also tries to instantiate the template for non reflectable types like 'uint8_t' which the ReflectableClass concept does not support
                }
            }
        });
    }
  }

Where this fails is -- while unrolling to 20-30 if comparisons -- that only a few fields match the 'reflectable nested class' criteria and name, but it tries to check this for all other fields that are not further reflectable. Looping through all 20-30 fields once isn't that of a problem because there will be at least one match and action for each field -- either transferring data or recursively diving into the struct.

Basically, it seems to be related to the mixed if statement with one condition being constexpr (and requiring 'reflectable') and the other being only evaluated during run-time. :thinking:

I also dived into your runtime::invoke implementation and also got the thought of first mapping all the types (once) to a tuple-to-array of fields that only match the reflectable criteria and then to loop only through that list for matching field names.

Any thoughts on this?

Not sure what that problem with that code you're mentioning was, but please consider opening another issue.

I added initially some debugging marks/printouts and the template instantiation for the fmt::format failed for types in branches that were supposed to ensure that they are entered only for printable (primitive) types and were also invoked for the nested structs. I guess that this is also related to the mixed evaluation of constexpr and run-time conditions. Removing thestatic_assert` seem to bypass this specific issue but is more symptomatic than the actual cause.

In any case, need to find and spend more time on this ... for the time being, I am engulfed in admin issues ...

@veselink1 thanks again for your quick help! Much appreciated! :+1:

N.B. you may close this issue if you haven't any further hints/suggestions...

RalphSteinhagen commented 3 years ago

@veselink1 found a practical (ie. good-enough) solution which combines the example you posted earlier and if constexpr (...) that narrows the if/else branches during compile-time to only the nested/reflectable classes:

int32_t searchIndex = -1;
try {
    searchIndex = static_cast<uint64_t>(findMemberIndex<T>(fieldName));
} catch (std::out_of_range &e) {
    // error handling for missing/extra unknown fields etc
}

if (intDataType == yas::getDataTypeId<START_MARKER>()) {
    // reached start of sub-structure -> dive into nested class
    for_each(refl::reflect<T>().members, [&buffer, &value, &depth, searchIndex](auto member, int32_t index) {
        using MemberType   = std::remove_reference_t<decltype(member(value))>;
        if constexpr (isReflectableClass<MemberType>()) { // should be evaluated during compile-time
            if (index == searchIndex) {
               deserialise(buffer, member(value), depth + 1);
            }
        }
    });
    buffer.position() = dataEndPosition;
    continue;
}

// handling copying data into member fields with '!isReflectableClass<MemberType>()' logic

At least the if constrexpr pretty much solved the issues I mentioned above. Haven't done any in-depth performance evaluation but the code is short... which bodes well. Thanks again for your help and hints!