getml / reflect-cpp

A C++20 library for fast serialization, deserialization and validation using reflection. Supports JSON, BSON, CBOR, flexbuffers, msgpack, TOML, XML, YAML / msgpack.org[C++20]
https://getml.github.io/reflect-cpp/
MIT License
901 stars 76 forks source link

fix: incorrect json schema for size validated fields #146

Closed Knyukua closed 1 month ago

Knyukua commented 1 month ago

A fix for the incorrect JSON schema being generated for size validated fields.

At first I was just tinkering around with the library and testing its features, and one thing I was interested in is automatic JSON schema generation. I wanted to be sure my strings would be X characters long, and the library had the support for that, but JSON schema generated for such fields is obviously incorrect.

So for the struct:

struct TestStruct
{
    template<typename T, auto Min, auto Max>
    using SizeValidated = rfl::Validator<T, rfl::Size<rfl::Minimum<Min>>, rfl::Size<rfl::Maximum<Max>>>;

    SizeValidated<std::string, 1, 2> a;
};

generated schema is:

{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$ref": "#/definitions/TestStruct",
    "definitions": {
        "TestStruct": {
            "type": "object",
            "properties": {
                "a": {
                    "allOf": [
                        {
                            "minimum": 1,
                            "type": "number"
                        },
                        {
                            "maximum": 2,
                            "type": "number"
                        }
                    ]
                }
            ],
            "required": [
                "a"
            ]
        }
    }
}

and "a" definitely should not be a "number"

With my fix generated schema is:

{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$ref": "#/definitions/TestStruct",
    "definitions": {
        "TestStruct": {
            "type": "object",
            "properties": {
                "a": {
                    "allOf": [
                        {
                            "type": "string",
                            "minLength": 1
                        },
                        {
                            "type": "string",
                            "maxLength": 2
                        }
                    ]
                }
            ],
            "required": [
                "a"
            ]
        }
    }
}

My fix is not the prettiest and needs some work (especially with naming), and I hope you could help me with that. But it works! I couldn't find any more edge cases (e.g. using AnyOf/AllOf inside the Size), so please point out the ones I missed. I'm also attaching a JSON schema generated for

struct TestStruct
{
    template<typename T, auto Min, auto Max>
    using SizeValidated = rfl::Validator<T, rfl::Size<rfl::Minimum<Min>>, rfl::Size<rfl::Maximum<Max>>>;

    SizeValidated<std::string, 1, 2> a;
    SizeValidated<std::vector<int>, 1, 2> s;
    SizeValidated<std::set<int>, 1, 2> d;
    SizeValidated<std::map<int, int>, 1, 2> f;
    rfl::Validator<std::string, rfl::Size<rfl::AnyOf<rfl::EqualTo<4>, rfl::EqualTo<6>>>> g;
    rfl::Validator<std::string, rfl::AnyOf<rfl::Size<rfl::EqualTo<4>>, rfl::Size<rfl::EqualTo<6>>>> h;
    rfl::Validator<std::string, rfl::Size<rfl::EqualTo<6>>> j;
};

schema.json

Knyukua commented 1 month ago

I also found that JSON schema generated for maps doesn't match the value written by the rfl::json::write:

"a": {
  "type": "array",
  "items": {
      "type": "array",
      "prefixItems": [
          {
              "type": "integer"
          },
          {
              "type": "integer"
          }
      ],
      "items": false
  }
}

so the expected value is something like "a": [[1, 2]], but rfl::json::write outputs

"a": {
  "1": 2
}
liuzicheng1987 commented 1 month ago

Hi @Knyukua, thanks for your contribution. This looks great and I will definitely merge.

Could you add a test or modify an existing one that checks the desired behaviour?

Knyukua commented 1 month ago

Sure I'll add a test for that

Knyukua commented 1 month ago

Added a simple test. Can we somehow implement to_schema for class fields? I think that would be useful for testing so if test fails we could know where exactly it failed, because right now it is just a very long string which you can't really read and find a difference between expected and actual.

liuzicheng1987 commented 1 month ago

@Knyukua , what to you mean by "class fields"?

Knyukua commented 1 month ago

@liuzicheng1987 By that I mean something like

struct TestStruct
{
    template<typename T, auto Min, auto Max>
    using SizeValidated = rfl::Validator<T, rfl::Size<rfl::Minimum<Min>>, rfl::Size<rfl::Maximum<Max>>>;

    SizeValidated<std::vector<int>, 3, 5> a;
};

rfl::json::to_schema(&TestStruct::a, rfl::json::pretty);

would yield

"a": {
    "allOf": [
        {
            "minimum": 3,
            "type": "number"
        },
        {
            "maximum": 5,
            "type": "number"
        }
    ]
}

I don't know how appropriate that is, but it would be helpful for testing associated with JSON schemas, since these long unreadable strings are hard to find errors in. That's just a thought, nothing much.

liuzicheng1987 commented 1 month ago

@Knyukua , why don't you just to this?

struct TestStruct
{
    template<typename T, auto Min, auto Max>
    using SizeValidated = rfl::Validator<T, rfl::Size<rfl::Minimum<Min>>, rfl::Size<rfl::Maximum<Max>>>;

    SizeValidated<std::vector<int>, 3, 5> a;
};

rfl::json::to_schema<TestStruct::SizeValidated<std::vector<int>, 3, 5>>(rfl::json::pretty);

I am pretty sure that this would work...

Knyukua commented 1 month ago

Sadly, no

{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "allOf": [
        {
            "minimum": 3,
            "type": "number"
        },
        {
            "maximum": 5,
            "type": "number"
        }
    ],
    "definitions": {}
}
Knyukua commented 1 month ago

Maybe that's just a bug in my fix anyway

Knyukua commented 1 month ago

Oh wait im using the wrong branch

Knyukua commented 1 month ago

Well, yeah, i'm dumb, it works. I'd like to rewrite the test with that in mind. Sorry for the inconvenience

liuzicheng1987 commented 1 month ago

@Knyukua , sure, no problem. Rewrite the test if you want to.

Knyukua commented 1 month ago

@liuzicheng1987 Also may I ask, why are tests being split in so many files? E.g. why is test_enum is 7 files and not 1 file 7 tests? And their naming is questionable, wouldn't it be better if we named tests with what is actually being tested? test_enum7 doesn't really tell me anything, but test_get_enumerator_array would

liuzicheng1987 commented 1 month ago

@Knyukua , I like to have one test per file, because it makes it easier to find things. However, I think we would probably be able to reduce the compile time if we bundled the tests a bit and that's probably a very strong argument for doing so. In the early days, we didn't have too many tests and it didn't matter as much, but now it does.

As far as the naming is concerned, I guess that's just laziness. I personally don't really care one way or another. If you want to rename them, feel free. I'll merge a separate PR related to that.

liuzicheng1987 commented 1 month ago

I am merging this. Thank you so much for your contribution!