stephenberry / glaze

Extremely fast, in memory, JSON and interface library for modern C++
MIT License
1.21k stars 120 forks source link

Parsing glz::json_t using glz::read_json #1164

Closed pblxptr closed 1 month ago

pblxptr commented 3 months ago

Hi, looking at the following code:

#include <glaze/glaze.hpp>

#include <cassert>

struct Widget
{
    std::string currency;
    float value;
};

int main()
{
    glz::json_t json = {
        {"pi", 3.141},
        {"happy", true},
        {"name", "Stephen"},
        {"nothing", nullptr},
        {"answer", {{"everything", 42.0}}},
        {"list", {1.0, 0.0, 2.0}},
        {"widget", {
           {"currency", "USD"},
           {"value", 42.99}
       }}
    };

    auto widget = Widget{};
    auto buf = json["widget"];
    [[maybe_unused]] auto parseError = glz::read_json<Widget>(widget, buf);
    assert(!parseError);
    assert(widget.currency == "USD");

    return 0;
}

the code does not compile however I'm wondering why is that? Isn't it reasonable to allow deserializing structs from generic json using existing api and reflection? If not then why?

I managed to do the following workaround but it adds some overheads, namely writing json_t to string and then back to Widget

#include <glaze/glaze.hpp>

#include <cassert>

struct Widget
{
    std::string currency;
    float value;
};

int main()
{
    glz::json_t json = {
        {"pi", 3.141},
        {"happy", true},
        {"name", "Stephen"},
        {"nothing", nullptr},
        {"answer", {{"everything", 42.0}}},
        {"list", {1.0, 0.0, 2.0}},
        {"widget", {
           {"currency", "USD"},
           {"value", 42.99}
       }}
    };

    auto widget = Widget{};
    auto widgetJson = json["widget"];
    auto buf = std::string{};
    glz::write_json(widgetJson, buf);

    [[maybe_unused]] auto parseError = glz::read_json<Widget>(widget, buf);
    assert(!parseError);
    assert(widget.currency == "USD");

    return 0;
}
stephenberry commented 3 months ago

I really like this request. However, currently glz::read_json requires a buffer or characters as the second argument. Hence, your more expensive solution is necessary.

I do think this direct conversion would be extremely useful and not too difficult to implement. I'll definitely look into implementing this in the future.

Thanks for the suggestion!

misyltoad commented 3 months ago

I also had a desire for this, and had the same workaround. It would be very appreciated =D

stephenberry commented 1 month ago

I'll be merging in a top-level solution to this soon: #1337. This adds support for using a glz::json_t value as the source to read into a concrete C++ value.

Currently this uses the same costly workaround described in the comments. It serializes to JSON and then parses this. So, this merge will just simplify code that needs this translation and provide the top level API that can be optimized in the future.

To make a fully optimized version of this translation would require thousands of lines of code and a massive number of tests. Because for every C++ type we would have to define the translation from glz::json_t just like we define it for a string like buffer. Even more work would need to be done to support wrappers as well.

So, this intermediate buffer approach is far simpler to implement, albeit it comes at a heavy translation cost.

One potential optimization is to use the BEVE binary format as the intermediary translation format, rather than JSON. Because this reads/writes much faster and uses less memory. This would be invisible to the user and be a simple change.

I'd appreciate any thoughts on using a binary format for the translation from glz::json_t to another concrete type. BEVE is also much faster when it comes to handling variants, which glz::json_t nests variants.

stephenberry commented 1 month ago

Initial support has been merged in #1337. Currently the translation is made through JSON for the least surprise and best stability, but future optimizations will be made to make this translation much faster. I would prefer a direct translation, but this will be a significant effort. For now, this greatly simplifies code for users and provides an interface that can be optimized in the future.