getml / reflect-cpp

A C++20 library for fast serialization, deserialization and validation using reflection. Supports JSON, BSON, CBOR, flexbuffers, msgpack, TOML, XML, YAML / msgpack.org[C++20]
https://getml.github.io/reflect-cpp/
MIT License
934 stars 78 forks source link

inconsistent treatment of unexpected data formats / feature request DefaultIfMissing #167

Open fuchsi-fuchs opened 1 month ago

fuchsi-fuchs commented 1 month ago

At the moment reflect-cpp is inconsistent in how strict it is with which data is accepted. Missing fields in the data are treated as an error, but superfluous fields are ignored silently.

There should be modes to either treat both missing and superfluous fields as error (strict) ; as well as to treat neither as an error (lenient). (and i would argue one of these modes should be the default instead of the current mix)

For my current application I want to parse data (that I get from another program) that might be missing some fields. I could of course change all of my structs to define all variables as rfl::Fields, but instead I would prefer to be able to parse this as an argument to the load function.

I have tried to write a Processor to this end to define a default value to every field as the documentation suggested might be possible

struct DefaultIfMissing {
template <class StructType>
    static auto process(auto&& _named_tuple) {
        return _named_tuple.transform([](auto field){
            return (field = rfl::default_value);
        });
    }
};

[...]
rfl::json::load<Map, DefaultIfMissing>(path).value();
[...]

this compiles - but results in a segmentation fault. (though now that I think about it the documentation about Fields only uses these default_values to write a json - can they even be used to default the values when reading a json that is missing these fields?)

And anyway I would prefer the field to be initialized as defined in the struct. So given

struct Map {
    int a{1};
    int b{2};
};

and a json {"a":123} I would want something like rfl::json::load<Map, rfl::DefaultIfMissing>(path).value(); to return a Map with a==123 && b==2 instead of setting b to a default constructed int.

Is this currently possible with this library? And what do you think about a strict and a lenient parsing mode?

liuzicheng1987 commented 1 month ago

Let me address this one by one:

1) The current approach of complaining about missing fields but ignoring extra fields is how similar libraries in other languages, like Pydantic, behave and I think that makes a lot of sense from a type theoretical point of view as well.

2) If you want the field to be optional, you could just use std::optional or std::shared_ptr. That would be the easiest to get what you want.

3) I agree that your proposed method of assigning the default value inside the struct is the most elegant and intuitive way of handling the problem, and it has been suggested in previous issues, but unfortunately it is not very easy to make this work in C++. So far, I haven't found a good solution.

4) I think a "strict mode", as you propose, in which additional fields lead to parsing errors instead of being silently ignored, is a great idea and would not be very hard to implement. If you are interested, I can guide you on how to do it and you could give it a shot yourself.

5) A "lenient mode" is a bit more tricky, because it assumes that all of the classes have default values, which they might not have.

liuzicheng1987 commented 1 week ago

@fuchsi-fuchs I have been thinking about this and here is what we could do:

struct DatabaseAccessData
{
    rfl::Default<std::string> login = "admin";
    rfl::Default<std::string> password = "admin";

    rfl::Default<std::string> ip = "127.0.0.1";
    rfl::Default<std::string> database = "test";
    rfl::Default<unsigned short> port = 3306;
};

Now, when you read in a JSON or YAML and the field is missing, it will just take the default value.

You can access fields like this:

my_struct.login(); // returns reference to std::string

It would also be possible to then write a preprocessor that would just fill ALL fields with their default values which is what you have attempted.

What are your thoughts?

liuzicheng1987 commented 5 days ago

@fuchsi-fuchs, here is the first draft and a simple test that shows how this works:

https://github.com/getml/reflect-cpp/blob/f/default_values/tests/json/test_default_if_missing.cpp

Let me know what your thoughts are.