stephenberry / glaze

Extremely fast, in memory, JSON and interface library for modern C++
MIT License

field-based parse bifurcation #1045

Open shiretu opened 3 months ago

shiretu commented 3 months ago

First of all, kudos for this outstanding library! Hands down the most comfortable library to use, and I have no reason to doubt the benchmarks.

I have started using it and, unfortunately, I have to deal with some not-so-well-designed APIs out there. Consider the following JSON messages, which arrive one after another (as single messages) over the same logical communication channel, so I cannot use out-of-band information to discern which is which:

{
    "type": "address",
    "streetName": "5th Avenue",
    "buildingNumber": 25
}
{
    "name": "John",
    "age": 25,
    "type": "person"
}

We can create a variant of the corresponding C++ structures and declare a tag via a meta object to say which is which. So far so good. The problem is that when I receive the second JSON, the type is legitimately encoded last in the object. So the parser spends a lot of time scanning to reach it, and then starts parsing again from the beginning. That is not ideal, but I'm not sure what can be done about it.

Even if it is the first field, we still parse it again to save it inside the newly spawned structure. Ideally, when the type property comes first, we would behave as we do now, but after reading the type, store it in the just-specialized variant and continue parsing from where we left off. Like a parse bifurcation.
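To make the idea concrete, here is a minimal single-pass sketch for the two flat messages above. It is not glaze code and does not reflect how glaze parses variants internally. `Cursor`, `assign`, and `parse` are hypothetical names, and the toy scanner assumes values are only escape-free strings or non-negative integers. Fields seen before "type" are kept as views into the input and replayed once the alternative is known, so the input is never scanned twice.

```cpp
#include <cctype>
#include <charconv>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <variant>
#include <vector>

struct Address { std::string streetName; int buildingNumber = 0; };
struct Person  { std::string name; int age = 0; };
// std::monostate means "type not known yet" or "parse failed".
using Message = std::variant<std::monostate, Address, Person>;

// Hypothetical cursor over a flat JSON object. Assumes values are only
// escape-free strings or non-negative integers.
struct Cursor {
    std::string_view s;
    std::size_t i = 0;

    void skip_ws() {
        while (i < s.size() && std::isspace(static_cast<unsigned char>(s[i]))) ++i;
    }
    bool eat(char c) {
        skip_ws();
        if (i < s.size() && s[i] == c) { ++i; return true; }
        return false;
    }
    // Returns a view of the string contents (without quotes) or of the digits.
    std::string_view token() {
        skip_ws();
        if (i < s.size() && s[i] == '"') {
            const std::size_t begin = ++i;
            while (i < s.size() && s[i] != '"') ++i;
            const std::string_view t = s.substr(begin, i - begin);
            if (i < s.size()) ++i;  // step over the closing quote
            return t;
        }
        const std::size_t begin = i;
        while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) ++i;
        return s.substr(begin, i - begin);
    }
};

int to_int(std::string_view t) {
    int v = 0;
    std::from_chars(t.data(), t.data() + t.size(), v);
    return v;
}

// Stores one field into whichever alternative is active; unknown keys are ignored.
void assign(Message& m, std::string_view key, std::string_view val) {
    if (auto* a = std::get_if<Address>(&m)) {
        if (key == "streetName") a->streetName = std::string(val);
        else if (key == "buildingNumber") a->buildingNumber = to_int(val);
    } else if (auto* p = std::get_if<Person>(&m)) {
        if (key == "name") p->name = std::string(val);
        else if (key == "age") p->age = to_int(val);
    }
}

Message parse(std::string_view json) {
    Cursor c{json};
    Message m;
    // Fields read before "type" was seen, kept as views into the input.
    std::vector<std::pair<std::string_view, std::string_view>> pending;
    if (!c.eat('{')) return Message{};
    do {
        const std::string_view key = c.token();
        if (!c.eat(':')) return Message{};
        const std::string_view val = c.token();
        if (key == "type") {
            // The bifurcation point: pick the alternative, then replay.
            if (val == "address") m = Address{};
            else if (val == "person") m = Person{};
            else return Message{};
            for (const auto& [k, v] : pending) assign(m, k, v);
            pending.clear();
        } else if (std::holds_alternative<std::monostate>(m)) {
            pending.emplace_back(key, val);
        } else {
            assign(m, key, val);  // type already known: store directly
        }
    } while (c.eat(','));
    return c.eat('}') ? m : Message{};
}
```

When "type" comes first, `pending` stays empty and every field is written straight into the chosen structure. When it comes last, the only extra cost is the replay of already-tokenized fields.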

Here is a better example JSON:

{
    "type": "company",
    "sharesPrice": "50",
    "actives": [
        {
            "type": "car",
            "brand": "Mercedes"
        },
        {
            "type": "building",
            "levels": 35
        }
    ]
}
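For reference, a rough sketch of the C++ types this JSON would map to, using only the standard library. The struct names and the `type_of` helper are hypothetical and chosen just for illustration. The point is that every element of "actives" is itself a variant, so the tag lookup happens once per element, not once per message.

```cpp
#include <string>
#include <variant>
#include <vector>

struct Car      { std::string brand; };
struct Building { int levels = 0; };

// Each entry of "actives" is one of these alternatives, selected by "type".
using Active = std::variant<Car, Building>;

struct Company {
    std::string sharesPrice;      // encoded as a string in the JSON above
    std::vector<Active> actives;  // one tag lookup per element
};

// Returns the "type" tag for whichever alternative is held.
std::string type_of(const Active& a) {
    struct Visitor {
        std::string operator()(const Car&) const { return "car"; }
        std::string operator()(const Building&) const { return "building"; }
    };
    return std::visit(Visitor{}, a);
}
```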

stephenberry commented 3 months ago

Thanks for the encouragement! And, thanks for explaining this issue in detail.

I recently opened issue #1019, which concerns adding reader/writer structures for incremental parsing. I think supporting these would make what you want with variant handling straightforward. I'd prefer to use a general solution within the variant parsing code rather than writing something specific to variants.

I'll keep this issue alive, but it isn't a priority at the moment, so it might be a while. In some cases the new partial read support can be used, but I'd recommend using the std::variant approach, because it tends to be cleaner and can be optimized in the future.