beached / daw_json_link

Fast, convenient JSON serialization and parsing in C++
https://beached.github.io/daw_json_link/
Boost Software License 1.0

Support for streams of JSON data and incremental parsing? #426

Open KKhanhH opened 3 months ago

KKhanhH commented 3 months ago

Hi,

I'm looking into switching my project from RapidJSON to DAW JSON Link in order to avoid allocations. One feature RapidJSON supports is a custom stream type that its parser can read from. My project uses a large, compressed JSON file containing a few objects with large nested arrays. I would like to avoid allocating one large string buffer for the entire decompressed file. Instead, I want to keep a small buffer holding part of the decompressed data and incrementally deserialize the nested arrays and outer arrays into their own classes. Is something like this possible with this library?

Here is an example of what I'm looking for:

Sample JSON

{
    "class1": [
        {
            //regular old class
        },
        {
            //regular old class
        }
    ],
    "obj1": {
        "header": ["name","value","nestedClass"],
        "nestedClassHeader": ["name","value","value2"],
        "data": [
            [
                "str", 1.324, [["abc",1,1.0],["cdf",2,2.5]] // many more arrays
            ],
            ["str2", 1.234, [["abc",1,1.0],["cdf",2,2.5]]], //many more arrays

        ]
    },
    "obj2": {
        //same structure as above
    }
}

obj1 and obj2 would contain a name, a value, and then a vector of "nestedClasses".

beached commented 3 months ago

Does something like https://github.com/beached/daw_json_link/blob/release/docs/cookbook/unknown_types_and_raw_parsing.md handle that for you? You could then later use something like a json_array_iterator over that member.

KKhanhH commented 3 months ago

From the example, I can't tell whether json_raw accepts an incomplete JSON value. Would it be able to recognize obj1's name and the other members "header" and "nestedClassHeader" if the string only contained, say, the first 3 array elements of obj1["data"]? I am trying to avoid storing the entire array at once: the array nested inside each element of "data" could contain several thousand arrays, and "data" itself could contain several thousand elements.

beached commented 3 months ago

It defaults to a json_value, which holds the character positions of that member (like a string_view) and lets you parse it later; for now it just skips over it.
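
To illustrate what "holding the character positions" means, here is a minimal std-only sketch. It is not daw_json_link's implementation: `skip_value` is a hypothetical helper that assumes well-formed JSON with no comments and no leading whitespace. It returns a view of the first value's text without parsing its contents, and returns an empty view when the value's end is not in the buffer.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of daw_json_link.
// Returns a view covering the JSON value that starts at json[0], without
// parsing what is inside it. An empty view means the end of a string, array,
// or object was not found in the buffer.
inline std::string_view skip_value(std::string_view json) {
    if (json.empty()) {
        return {};
    }
    if (json[0] == '"') {
        // String: scan to the closing quote, stepping over escaped characters.
        for (std::size_t i = 1; i < json.size(); ++i) {
            if (json[i] == '\\') {
                ++i;  // skip the escaped character
            } else if (json[i] == '"') {
                return json.substr(0, i + 1);
            }
        }
        return {};  // unterminated string
    }
    if (json[0] == '[' || json[0] == '{') {
        // Array or object: track bracket depth, ignoring brackets in strings.
        int depth = 0;
        bool in_string = false;
        for (std::size_t i = 0; i < json.size(); ++i) {
            char const c = json[i];
            if (in_string) {
                if (c == '\\') {
                    ++i;
                } else if (c == '"') {
                    in_string = false;
                }
            } else if (c == '"') {
                in_string = true;
            } else if (c == '[' || c == '{') {
                ++depth;
            } else if ((c == ']' || c == '}') && --depth == 0) {
                return json.substr(0, i + 1);
            }
        }
        return {};  // the closing bracket is not in the buffer
    }
    // Number, true, false, or null: ends at the next delimiter. If no
    // delimiter is present, the rest of the buffer is returned.
    return json.substr(0, json.find_first_of(",]} \t\r\n"));
}
```

The returned view points into the caller's buffer, so parsing it later only works while that buffer is still alive and unchanged.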

KKhanhH commented 3 months ago

Ah okay, that sounds like it will need to be a seekable stream then. Are there any examples for constructing a json_value from a custom stream class rather than a contiguous string/string_view?
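
For reference, the kind of incremental processing I have in mind could be sketched like this, independent of any library. `ArrayElementSplitter` is a hypothetical class, not a daw_json_link or RapidJSON API. It assumes it is fed the text of a single well-formed JSON array in arbitrary chunks, and it buffers only one top-level element at a time before handing it to a callback, which could then pass it to a regular parser.

```cpp
#include <functional>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical helper: splits the text of one JSON array, delivered in
// arbitrary chunks, into its top-level elements.
class ArrayElementSplitter {
public:
    explicit ArrayElementSplitter(std::function<void(std::string_view)> on_element)
        : on_element_(std::move(on_element)) {}

    // Feed the next chunk of decompressed text. Chunk boundaries may fall
    // anywhere, including inside strings or numbers.
    void feed(std::string_view chunk) {
        for (char const c : chunk) {
            if (in_string_) {
                current_.push_back(c);
                if (escaped_) {
                    escaped_ = false;
                } else if (c == '\\') {
                    escaped_ = true;
                } else if (c == '"') {
                    in_string_ = false;
                }
                continue;
            }
            switch (c) {
            case '"':
                in_string_ = true;
                current_.push_back(c);
                break;
            case '[':
            case '{':
                // The first '[' opens the outer array and is not stored.
                if (depth_++ > 0) current_.push_back(c);
                break;
            case ']':
            case '}':
                // Reaching depth 0 closes the outer array: emit the last element.
                if (--depth_ == 0) flush(); else current_.push_back(c);
                break;
            case ',':
                // A comma at depth 1 separates two top-level elements.
                if (depth_ == 1) flush(); else current_.push_back(c);
                break;
            case ' ': case '\t': case '\r': case '\n':
                break;  // whitespace outside strings is insignificant
            default:
                if (depth_ >= 1) current_.push_back(c);
            }
        }
    }

private:
    void flush() {
        if (!current_.empty()) {
            on_element_(current_);
            current_.clear();
        }
    }

    std::function<void(std::string_view)> on_element_;
    std::string current_;  // text of the element currently being assembled
    int depth_ = 0;
    bool in_string_ = false;
    bool escaped_ = false;
};
```

Memory use is bounded by the largest single element rather than the whole array. Since one element of "data" can itself hold thousands of nested arrays, the same splitting would have to be applied one level deeper to bound memory further.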