USCiLab / cereal

A C++11 library for serialization
BSD 3-Clause "New" or "Revised" License

JSON serialization and deserialization of char type #310

Open felixzng opened 8 years ago

felixzng commented 8 years ago

Is it possible to serialize a char as a character literal instead of its ASCII code? Likewise, is it possible to deserialize it from a literal instead of from the ASCII code? Consider the code below. This is its output:

{ "Foo": { "c": 65 }
B
terminate called after throwing an instance of 'cereal::RapidJSONException'
  what(): rapidjson internal assertion failure: IsObject()


#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <cereal/archives/json.hpp>

struct Foo
{
    char        c = 'A';

    template <class Archive>
    void serialize( Archive & ar )    { ar( CEREAL_NVP(c) ); }
};

int
main(int argc, char ** argv)
{
    Foo foo;

    // create output archive
    // this will print c as 65
    cereal::JSONOutputArchive output_ar(std::cout);
    output_ar( cereal::make_nvp("Foo", foo) );
    std::cout << std::endl;

    // this will work
    std::istringstream ss1("{ \"Foo\": { \"c\": 66 } }");
    cereal::JSONInputArchive input_ar1(ss1);
    input_ar1( cereal::make_nvp("Foo", foo) );
    std::cout << foo.c << std::endl;

    // this will throw an exception: 'B' in single quotes is not valid JSON
    // (a separate stream is used because the first one has already been read to the end)
    std::istringstream ss2("{ \"Foo\": { \"c\": 'B'} }");
    cereal::JSONInputArchive input_ar2(ss2);
    input_ar2( cereal::make_nvp("Foo", foo) );
    std::cout << foo.c << std::endl;
}

AzothAmmo commented 8 years ago

Looks like we aren't catching chars explicitly so they get serialized as numbers. The desired behavior (as it is for XML) would be ASCII characters for char and numbers for int8_t and uint8_t. It looks like we need to convert to a string for RapidJSON or modify RapidJSON to support directly reading/writing chars.
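To make the distinction above concrete, here is a minimal sketch of overloading on char separately from the 8-bit integer types. It is not cereal's or RapidJSON's code: `json_value` is a hypothetical helper name, and the sketch assumes that `std::int8_t` and `std::uint8_t` are aliases for `signed char` and `unsigned char` (true on mainstream platforms, where both are distinct from plain `char`). Escaping is deliberately left out, so only printable ASCII other than the quote and backslash characters is handled.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper, for illustration only.
// A char becomes a one-character JSON string. No escaping is done here, so
// '"', '\\', control characters and non-ASCII bytes are not handled.
inline std::string json_value(char c)
{
    return std::string("\"") + c + "\"";
}

// The 8-bit integer types stay numeric. They are widened to int first
// because std::to_string has no overload for 8-bit types.
inline std::string json_value(std::int8_t v)
{
    return std::to_string(static_cast<int>(v));
}

inline std::string json_value(std::uint8_t v)
{
    return std::to_string(static_cast<int>(v));
}
```

With these overloads, `json_value('A')` yields a quoted string while `json_value(std::int8_t{65})` yields a bare number, which is the behaviour the XML archive is described as having.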

m7thon commented 8 years ago

What exactly is wrong with serializing a char as a number? Also, is it really safe to serialize a char as a string? What about '\0', or any non-ASCII char?

I'd argue that the desired behavior is primarily that deserialization works and gives back the same value.

felixzng commented 8 years ago

Let's consider this use case.

enum Tp_e { Tp_A = 'A', Tp_B = 'B', Tp_C = 'C' };

struct Foo
{
    Tp_e tp = Tp_B;

    template <class Archive>
    void serialize( Archive & ar )
    {
        ar( CEREAL_NVP(tp) );
    }
};

Struct Foo is used as an application configuration. It is saved in a JSON file that the user can edit. As things stand, the user will see the following: { "Foo": { "tp": 66 } }

With the proposed change it should look like this: { "Foo": { "tp": 'B' } }

I would argue that being able to modify the configuration using char literals instead of ASCII codes is much more elegant and user-friendly. Please note that I did not suggest serializing char to a string. I hope we all agree that 'A' is not the same as "A".

reuk commented 8 years ago

JSON doesn't have char literals, so your options are either ints or strings in the JSON output (if you want it to be valid JSON anyway). I'd argue that ints are probably the more sane behaviour for the default case.

In your example, the issue is not actually 'serializing chars', it's 'serializing enums in a human-readable/editable way', which is a significant difference. I'd recommend writing a load_minimal/save_minimal pair for your enum, which would also give you the option of serializing to something a bit more descriptive than a single character.
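A rough sketch of what such a pair could look like for the enum above, untested against any particular cereal version. `tp_to_string` and `tp_from_string` are hypothetical helper names, and the one-character string representation is just one possible choice. The cereal hooks are compiled only when the cereal headers can be found, so the conversion helpers can be tried on their own.

```cpp
#include <stdexcept>
#include <string>

enum Tp_e { Tp_A = 'A', Tp_B = 'B', Tp_C = 'C' };

// Hypothetical helper: the enum's underlying value is the character itself.
inline std::string tp_to_string(Tp_e tp)
{
    return std::string(1, static_cast<char>(tp));
}

// Hypothetical helper: reject anything that is not a known enumerator.
inline Tp_e tp_from_string(std::string const & s)
{
    if (s == "A") return Tp_A;
    if (s == "B") return Tp_B;
    if (s == "C") return Tp_C;
    throw std::invalid_argument("unknown Tp_e value: " + s);
}

#if defined(__has_include)
#if __has_include(<cereal/cereal.hpp>)
#include <cereal/cereal.hpp>

// Non-member minimal pair, declared in the same namespace as the enum.
template <class Archive>
std::string save_minimal(Archive const &, Tp_e const & tp)
{
    return tp_to_string(tp);
}

template <class Archive>
void load_minimal(Archive const &, Tp_e & tp, std::string const & value)
{
    tp = tp_from_string(value);
}
#endif
#endif
```

The intent is that a field such as `tp` is then written as a JSON string ("B") rather than as 66, which stays valid JSON while remaining editable by hand. Returning a longer name from `tp_to_string` would give the more descriptive form mentioned above.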

felixzng commented 8 years ago

You are correct: JSON does not have a char literal in its spec. I was hoping to add this as an enhancement to cereal's JSON encoder/decoder, since the XML encoder/decoder does support it. I totally agree that the default behaviour should be int. It would be great, though, to have an option to encode/decode to/from a literal. Thanks for suggesting the minimal functions; they only work on a single value, though.