beached / daw_json_link

Fast, convenient JSON serialization and parsing in C++
https://beached.github.io/daw_json_link/
Boost Software License 1.0

How to declare recursive self referencing data structure? #421

Open andreasdamm-shure opened 4 months ago

andreasdamm-shure commented 4 months ago

Trying to declare the data contract for a variant type that can have itself as one of the alternatives.

struct Variant
{
   std::variant<int, bool, std::shared_ptr<Variant>> value;
};

A std::shared_ptr has to be used as the compiler complains about incomplete type otherwise.
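For context, a standard-library-only sketch of why indirection is needed and what forms it can take. A type cannot directly contain a std::variant that holds the type itself by value, because its size would be unbounded. A std::shared_ptr works, and since C++17 a std::vector of the still-incomplete type works as well. The names PtrVariant, VecVariant, make_sample, and check_sample are hypothetical and used only for illustration; nothing here touches daw_json_link.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <variant>
#include <vector>

// Indirection through shared_ptr: the pointer has a fixed size, so the
// enclosing type may still be incomplete where the member is declared.
struct PtrVariant {
    std::variant<int, bool, std::shared_ptr<PtrVariant>> value;
};

// Since C++17, std::vector may be instantiated with an incomplete element
// type, provided the type is complete before any vector member is used.
struct VecVariant {
    std::variant<int, bool, std::vector<VecVariant>> value;
};

// Builds the in-memory equivalent of the JSON array [1, true].
inline VecVariant make_sample() {
    std::vector<VecVariant> items;
    items.push_back(VecVariant{1});
    items.push_back(VecVariant{true});
    return VecVariant{std::move(items)};
}

// Small usage check: the array alternative is the third one (index 2).
inline void check_sample() {
    assert(make_sample().value.index() == 2);
}
```

The vector form maps naturally onto JSON arrays, while the shared_ptr form only models a single nested value.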

Calling

auto payload = daw::json::to_json(Variant {1}, std::vector<std::uint8_t> {});

With the following contract, the call to to_json does not compile (parse_to_t missing).

namespace daw::json {

template<>
struct json_data_contract<Variant>
{
   using type = json_type_alias<
        json_variant_no_name<std::variant<int, bool, std::shared_ptr<Variant>>,
        json_variant_type_list<
            json_number_no_name<int>, 
            json_bool_no_name<bool>, 
            json_class_null_no_name<Variant, JsonNullable::Nullable, std::shared_ptr<Variant>>>>>;

   static auto to_json_data(const Variant &value) { return value.value; }
};

}

Are these types of data structures supported?

(Using MSVC 19.37.32825)

beached commented 4 months ago

Do you have any example JSON for this? It seems like the shared_ptr in this case could only be an int/bool, and because the mapping is an alias, there is no class alternative at this point. I do not fully understand the intent.

andreasdamm-shure commented 4 months ago

Using json_class_null_no_name is probably not quite right; json_link_no_name might be the correct JSON link type.

Example JSON would be

[
  true,
  false,
  42,
  [
    1,
    2,
    3,
    [
      true,
      [
        1,
        false
      ]
    ]
  ]
]

The JSON schema I am trying to implement is as follows

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "definitions": {
    "value": {
      "oneOf": [
        {
          "type": "number"
        },
        {
          "type": "boolean"
        },
        {
          "type": "array",
          "items": {
            "$ref": "#/definitions/value"
          }
        }
      ]
    }
  },
  "allOf": [
    {
      "$ref": "#/definitions/value"
    }
  ]
}

beached commented 4 months ago

I think this needs to go into the raw mapping type. So something like https://jsonlink.godbolt.org/z/nnoT8ssqY

#include <daw/json/daw_json_link.h>

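#include <cassert>
#include <cstddef>      // std::size_t
#include <cstdlib>      // std::abort
#include <string>
#include <string_view>  // std::string_view
#include <utility>
#include <variant>
#include <vector>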

struct Variant {
    std::variant<int, bool, std::vector<Variant>> value;
};

struct VariantCtor {
    Variant operator()(char const *ptr, std::size_t sz) const {
        auto value = daw::json::json_value(std::string_view(ptr, sz));
        return operator()(value);
    }

    Variant operator()(daw::json::json_value value) const {
        using namespace daw::json;
        switch (value.type()) {
            case JsonBaseParseTypes::Number:
                return Variant{from_json<int>(value)};
            case JsonBaseParseTypes::Bool:
                return Variant{from_json<bool>(value)};
            case JsonBaseParseTypes::Array: {
                auto res = std::vector<Variant>();
                for (auto jp : value) {
                    res.push_back(operator()(jp.value));
                }
                return Variant{std::move(res)};
            }
            default:
                std::abort();
        }
    }
};

namespace daw::json {
template <>
struct json_data_contract<Variant> {
    using type = json_type_alias<json_raw_no_name<
        std::variant<int, bool, std::vector<Variant>>, VariantCtor>>;

    static auto to_json_data(const Variant &value) { return value.value; }
};
}  // namespace daw::json

int main() {
    {
        constexpr std::string_view json_doc = "5";
        auto i = daw::json::from_json<Variant>(json_doc);
        assert(i.value.index() == 0);
    }
    {
        constexpr std::string_view json_doc = "false";
        auto b0 = daw::json::from_json<Variant>(json_doc);
        assert(b0.value.index() == 1);
    }
    {
        constexpr std::string_view json_doc = "true";
        auto b1 = daw::json::from_json<Variant>(json_doc);
        assert(b1.value.index() == 1);
    }
    {
        constexpr std::string_view json_doc =
            "[1, true, false, [1, false, []]]";
        auto ary = daw::json::from_json<Variant>(json_doc);
        assert(ary.value.index() == 2);
    }
}

beached commented 4 months ago

Serialization of that currently needs more manual work than I would like, though.

andreasdamm-shure commented 4 months ago

Does the constructor and json_value approach also cover serialization?

beached commented 4 months ago

Unfortunately, I don't see an easy path here. Recursive data structures aren't something that has been considered so far; they should get better support in the future.

beached commented 4 months ago

This will have to be a separate mechanism for recursive data structures, as the structure is unbounded at compile time. The depth/length isn't known until the data is parsed.

andreasdamm-shure commented 4 months ago

I have been exploring the use of json_custom_no_name with the option options::json_custom_opt(options::JsonCustomTypes::Literal), which allows for serialization. Deserialization doesn't work that way, as only primitive types are allowed. Using options::JsonCustomTypes::Any may be a way forward, but serialization then puts double quotes around the output.

Thinking about using the alternative mappings feature to use json_value for parsing and json_custom for serialization.

andreasdamm-shure commented 4 months ago

An advantage of using the conversion class in the json_custom approach is that its conversion operator can be split: the declaration goes before the data contracts are declared, and the definition goes after them. The definition can then use the established contract to call to_json on data of the same type as itself.

Maybe that might be an approach for recursive structures in general -- somehow using the definition/declaration split around the contract to be able to refer back to itself.
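A minimal sketch of that declaration/definition split, using only the standard library. The converter name VariantToJson and the helper check_to_json are hypothetical, and the comment marks where a data contract naming the converter would sit. The sketch does not call any daw_json_link API and builds the JSON text by hand, so it only illustrates the ordering idea, not how the library would wire it up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

struct Variant {
    std::variant<int, bool, std::vector<Variant>> value;
};

// Declaration only: this is all a mapping declared further down needs to name.
struct VariantToJson {
    std::string operator()(Variant const &v) const;
};

// ... a data contract that refers to VariantToJson would be declared here ...

// Definition: everything above is visible now, so the converter can
// recurse into nested values through itself.
std::string VariantToJson::operator()(Variant const &v) const {
    return std::visit(
        [this](auto const &alt) -> std::string {
            using T = std::decay_t<decltype(alt)>;
            if constexpr (std::is_same_v<T, bool>) {
                return alt ? "true" : "false";
            } else if constexpr (std::is_same_v<T, int>) {
                return std::to_string(alt);
            } else {
                // Array alternative: serialize each element recursively.
                std::string out = "[";
                for (std::size_t i = 0; i < alt.size(); ++i) {
                    if (i != 0) out += ',';
                    out += (*this)(alt[i]);
                }
                return out + "]";
            }
        },
        v.value);
}

// Small usage check on a one-level array.
inline void check_to_json() {
    Variant nested{std::vector<Variant>{Variant{1}, Variant{false}}};
    assert(VariantToJson{}(nested) == "[1,false]");
}
```

In the real library the recursive step would presumably go through to_json on the nested value rather than through the converter directly; that part is left out here because it depends on the contract.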

beached commented 4 months ago

I think a generalized recursive approach will need an explicit stack so that there isn't a stack overflow when the nesting gets deep. There have been attacks that hit JSON libraries like this in the past. That hasn't been needed so far, as everything is bounded by the type system's limits.
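To illustrate what an explicit stack means here, a minimal standard-library sketch that walks an already-built recursive value without recursive function calls. The function name count_leaves and its helper check_count_leaves are hypothetical. This is not how daw_json_link parses; it only shows the general technique of moving the pending work from the call stack to a heap-allocated container.

```cpp
#include <cassert>
#include <cstddef>
#include <variant>
#include <vector>

struct Variant {
    std::variant<int, bool, std::vector<Variant>> value;
};

// Counts the int/bool leaves of a nested Variant without recursive calls.
// Pending nodes live in a heap-allocated vector, so the traversal depth is
// limited by available memory rather than by the call stack.
inline std::size_t count_leaves(Variant const &root) {
    std::size_t leaves = 0;
    std::vector<Variant const *> pending{&root};
    while (!pending.empty()) {
        Variant const *node = pending.back();
        pending.pop_back();
        if (auto const *children =
                std::get_if<std::vector<Variant>>(&node->value)) {
            // Array: defer its elements instead of recursing into them.
            for (Variant const &child : *children) {
                pending.push_back(&child);
            }
        } else {
            ++leaves;  // int or bool
        }
    }
    return leaves;
}

// Small usage check: the equivalent of [1, [true]] has two leaves.
inline void check_count_leaves() {
    Variant v{std::vector<Variant>{
        Variant{1}, Variant{std::vector<Variant>{Variant{true}}}}};
    assert(count_leaves(v) == 2);
}
```

Only the traversal is iterative in this sketch. Building and destroying a deeply nested Variant still recurses through the vector constructors and destructors, so a full solution would need the same treatment in the parser and in cleanup, or a cap on nesting depth.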