getml / reflect-cpp

A C++20 library for fast serialization, deserialization and validation using reflection. Supports JSON, BSON, CBOR, flexbuffers, msgpack, TOML, XML, YAML / msgpack.org[C++20]
https://getml.github.io/reflect-cpp/
MIT License
990 stars 85 forks source link

Bug: number of fields in a struct is limited to 100; consider removing limitation? #154

Open ethindp opened 2 months ago

ethindp commented 2 months ago

I'm not certain if this is a limitation in the library itself or in compilers, but I do have a use case where I've exceeded this limit significantly and I don't really have a way of splitting the fields up since Idk what I'd even name the substructs. 😀 If this is a limitation at the compiler level, I believe it can be raised for most of them:

If this is just a library-imposed limit, may I ask why it is there?

ethindp commented 2 months ago

So, after some research, the way that bind_to_tuple is implemented shows that this is both a compiler and a library limitation:

However, C++17 and 20 offer fold expressions and std::index_sequence for situations like exactly this one. I'm not positive if the below exact code would work, but the macros could be elided entirely and the code could be refactored to something along the lines of the following:

// Helper struct to generate a tuple_view for a given type T with N fields.
template <std::size_t N>
struct tuple_view_helper {
  template <class T, std::size_t... Is>
  static constexpr auto tuple_view_impl(T& t, std::index_sequence<Is...>) {
    return std::tie(t.template get<Is>()...);
  }

  template <class T>
  static constexpr auto tuple_view(T& t) {
    static_assert(
        N > /* Limit we wish to impose here, if we want to */,
        "\n\nThis error occurs for one of two reasons:\n\n"
        "1) You have created a struct with more than <limit> fields, which is "
        "unsupported. Please split up your struct into several "
        "smaller structs and then use rfl::Flatten<...> to combine them. "
        "Refer "
        "to the documentation on rfl::Flatten<...> for details.\n\n"
        "2) You have added a custom constructor to your struct, which you "
        "shouldn't do either. Please refer to the sections on custom "
        "classes or custom parsers in the documentation "
        "for solutions to this problem.\n\n"
      );
    return tuple_view_impl(t, std::make_index_sequence<N>{});
  }
};

template <>
struct tuple_view_helper<0> {
  static constexpr auto tuple_view(auto&) { return std::tie(); }
};

template <class T>
constexpr auto tuple_view(T& t) {
  return tuple_view_helper<num_fields<T>>::tuple_view(t);
}

template <class T, typename F>
constexpr auto bind_to_tuple(T& _t, const F& _f) {
  auto view = tuple_view(_t);
  return [&]<std::size_t... Is>(std::index_sequence<Is...>) {
    return std::make_tuple(_f(std::get<Is>(view))...);
  }(std::make_index_sequence<std::tuple_size_v<decltype(view)>>{});
}

Note that I haven't tested this, but I (think) that something like this might work and might even make it possible to remove the limit entirely, assuming we can get it to work.

liuzicheng1987 commented 2 months ago

No, you can't use fold expressions or variadic templates here (our lives would be A LOT easier if we could do this, trust me...).

The problem is this line:

auto& [__VA_ARGS__] = t;

https://github.com/getml/reflect-cpp/blob/main/include/rfl/internal/bind_to_tuple.hpp

This is a structured binding and you can not use fold expressions or anything of that ilk inside structured bindings. There is a proposal on the matter, but as far as I know it has recently been rejected:

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1061r1.html

liuzicheng1987 commented 2 months ago

@ethindp , if you can find a way to decompose structs without using structured bindings, that would be quite revolutionary and I would certainly merge your PR. But I am not aware of any solution for this problem.

ethindp commented 2 months ago

@liuzicheng1987 I'm not skilled enough for eliminating the structured binding requirement, sadly. I'm just wondering if we can eliminate the macro. We could then possibly make it so that the compiler could figure out the structure for us so there would be (theoretically) no upper limit at all to how many items could be in a struct, other than template depth restrictions, which is solvable (for the most part (well, Idk about MSVC but...)). Right now we're invoking a macro to generate this stuff... I suppose in the short term we could use something similar to that Python code that was in the comment to just generate the macro code directly without a macro, which in theory should solve our problem as a temporary workaround. Then we just limit it to like 2048 or 4096 or something. Maybe soome other metaprogramming library could help us here but I don't know one off the top of my head (I'm no expert of C++ metaprogramming and template magic to begin with, so there's someone out there who probably would know better about this problem than me).

liuzicheng1987 commented 2 months ago

@ethindp , sure we could raise this to something like 2048, it's a simple fix in the Python program to generate the boilerplate code.

But why would getting rid of the Macro make any difference?

Also, do we need this? I don't see any genuine use case for going beyond 100 fields. I might be persuaded to raise it to something like 256, just to be sure. But who is ever going to have 2000 fields in a struct?

ethindp commented 2 months ago

@liuzicheng1987 Getting rid of the macro would sidestep compiler limits in regards to number of parameters that can be passed to a macro at any given invocation site is all.

As for why this is truly necessary, I don't know anyone who would ever have a struct with 2000 fields. So that may be a bit high. But 256 is probably a good limit. My use-case/requirements are probably a bit niche, but my only other option would be to either (1) manually write all the parsing logic for the struct, (2) use reflection to parse the struct, or (3) try to flatten the struct. Options 1 and 2 are overly cumbersome, and I'm not sure how to do option 3 without making the struct overly difficult to actually use and having to remember all the names of the substructs.

I could use an XSD but I don't have one. All I have is the original apps source code for parsing this particular XML document structure, and that's platform-specific. I suppose I could use an rfl::ExtraFields/rfl::Generic but... I'd like to avoid loss of type-safety. The ironic thing is that this is the only struct where this is an actual problem. I'm just uncertain of what other options I could try. If you have any suggestions I'm open to them though.

danielmohansahu commented 3 weeks ago

Seconding the request for a bump from 100 to 256.

liuzicheng1987 commented 3 weeks ago

@danielmohansahu OK, got it