getml / reflect-cpp

A C++20 library for fast serialization, deserialization and validation using reflection. Supports JSON, BSON, CBOR, flexbuffers, msgpack, TOML, XML, YAML / msgpack.org[C++20]
https://getml.github.io/reflect-cpp/
MIT License
901 stars 76 forks source link

Strange map json serialization with non string key type #14

Closed Pennywise007 closed 9 months ago

Pennywise007 commented 9 months ago

When I try to get a json from map<string, any> everything works as expected, for example:

struct TestStringKey {
    rfl::Field<"Field", std::map<std::string, std::string>> Field;
} testString = {std::map<std::string, std::string>{{"1", "first"}, {"2", "second"}}};

After conversion to json becomes: {"Field":{"1":"first","2":"second"}}

But when I try to do the same for map<int, any> I see some strange output, foe example(I love examples):

struct TestIntKey {
    rfl::Field<"Field", std::map<int, std::string>> Field;
} testInt =  {std::map<int, std::string>{{1, "first"}, {2, "second"}}};

After conversion to json becomes: {"Field":[[1,"first"],[2,"second"]]}

It became a list of ?unnamed? fields. This is weird, I am sure that one of the key features of json format is that we should be able to convert json to the same types in a different programming language. Let's check how the same serialization works in GoLang for example, I don't expect you to know it so let me explain a bit. C++ map<Key, Value> is equal to Go map[Key]Value syntax (actually unordered_map, but who cares). So I can write the same simple code on Go:

type TestIntKey struct {
    Field map[int]string
}
intKey := TestIntKey{map[int]string{
        1: "first",
        2: "second",
}}

res, _ := json.Marshal(&intKey)
fmt.Println(string(res))

The output will be: {"Field":{"1":"first","2":"second"}}

It just took a key and converted it to a string. I decided to check if I can deserialize your library output to this go struct:

text := `{"Field":[[1,"first"],[2,"second"]]}`
err := json.Unmarshal([]byte(text), &intKey)
if err != nil {
    fmt.Println(err.Error())
}

And I got an error json: cannot unmarshal array into Go struct field TestIntKey.Field of type map[int]string

For numeric types it is easy to fix, but what should happen when key is a bool type or event a struct? I saw that in some C# libraries it can serialize Dictionary to sth like:

[
  {
    "Key": {
      "Id": 69,
      "Name": "Gomer"
    },
    "Value": {
      "Id": 1488,
      "City": "Vavilon"
    }
  },
  {
    "Key": {
      "Id": 322,
      "Name": "Bart"
    },
    "Value": {
      "Id": 228,
      "City": "Florida"
    }
  }
]

But I also find it weird.

liuzicheng1987 commented 9 months ago

@Pennywise007 Very interesting issue, thank you!

I‘m actually familiar with Go, it’s a very cool language. 😊

So first of all, let me point out that the behavior is documented:

https://github.com/getml/reflect-cpp/blob/main/docs/standard_containers.md

As stated in the documentation and as you observe, map-like types for which the key is a string are treated as an object.

Otherwise they are treated like a normal container. And since the value_type of a map is a std::pair, this is how they are deserialized. So I think that in the context of C++, this makes sense.

I think in the context of Go, the behavior is appropriate, because Go keys cannot be structs, but C++ is way more flexible in that regard.

It would actually be fairly easy to fix this, but the question is, do we want to? How do languages other than Go approach this problem?

Pennywise007 commented 9 months ago

C# example with the most-known library Newtonsoft:

 TestIntKey intKey = new TestIntKey
 {
        Field = new Dictionary<int, string>
        {
            { 1, "first" },
            { 2, "second" }
        }
    };

  string json = JsonConvert.SerializeObject(intKey);

The output is similar with Go: {"Field":{"1":"first","2":"second"}}

For complex objects, you already saw the example.

liuzicheng1987 commented 9 months ago

Cool, very interesting. Thanks for your input.

Let me think about that. I am a bit concerned that if I follow through with that, then someone else is going to come along and open an issue like" Hey, why are you transforming my integers into strings without my permission?"

From a technical point of view, it would be super-easy to implement. Basically, here is how it would be done:

In https://github.com/getml/reflect-cpp/blob/main/include/rfl/parsing/Parser.hpp, there is a VectorParser (the naming is somewhat misleading, I should probably change it). This contains a constexpr called "treat_as_map":

static constexpr bool treat_as_map() {
  if constexpr (is_map_like_not_multimap<VecType>()) {
    if constexpr (internal::has_reflection_type_v<typename T::first_type>) {
      return std::is_same<
          std::decay_t<typename T::first_type::ReflectionType>,
          std::string>();
    } else {
      return false;
    }
  } else {
    return false;
  }
}

You would have to catch the case that T::first_type is a floating_point or integer and return true.

Then you would have to handle that particular case in the MapParser, which is in the same file. That's it. Could be done in 15 minutes.

I am kind of leaning towards doing this, because Rust's serde does it like that as well. I think there is increasing evidence that this is the most standard solution.

liuzicheng1987 commented 9 months ago

@Pennywise007 , I think your arguments were very good, so I have decided to implement your proposal.

You can check out the following tests:

https://github.com/getml/reflect-cpp/blob/main/tests/json/test_map.hpp

https://github.com/getml/reflect-cpp/blob/main/tests/json/test_map_with_key_validation.hpp

And the accompanying documentation:

https://github.com/getml/reflect-cpp/blob/main/docs/standard_containers.md

I think this is exactly what you were asking for, so I am closing the issue.