[Enhancement] Support more JSON backends

syoyo commented 1 year ago

Describe the issue

tinygltf currently depends on nlohmann JSON or RapidJSON as its JSON backend. Both are stable, but they have some issues (e.g. multithreading: https://github.com/syoyo/tinygltf/issues/241 ). It would be better to have more JSON backends for security (immunity to malicious glTF JSON), support for various environments (e.g. WASM/WASI, embedded platforms), and more use cases (faster glTF JSON parsing with multithreading).

There are some candidates for JSON backends:

Write our own JSON parser.
A JSON library in C (for portability)
sajson https://github.com/chadaustin/sajson (a branch exists, JSON read only: https://github.com/syoyo/tinygltf/tree/sajson )
simdjson https://github.com/simdjson/simdjson

To Reproduce

N/A

Expected behaviour

N/A

Screenshots

N/A

nigels-com commented 1 year ago

boost::json perhaps?

syoyo commented 1 year ago

👀

alexwennstrom commented 11 months ago

It could be really wise to support a simdjson backend. https://github.com/spnda/fastgltf claims to be faster but lacks some features. It looks like it uses the simdjson DOM API to parse, so there could be further improvements from using the simdjson ondemand parser.

spnda commented 11 months ago

Which features does it lack apart from writing glTFs and loading images? I think I offer a lot more than tinygltf currently does, and I would be happy to implement anything you feel is missing. There's a PR open for an API for writing glTFs. Not loading images is a conscious decision, as I feel the user should decide how they want to load them. fastgltf is specifically designed to do the absolute minimum unless you ask for more. It's a bit more explicit, but you have much more control over the loading process.

Also, the simdjson DOM parser actually turns out to be faster in my case. I used to use the ondemand parser, but it is slightly slower. Perhaps this is because I don't use the exception model, but I have not had enough time or resources to properly analyze where the slowdown comes from. What I do know is that the stage1 processing (which indexes important sections of the file) takes a very long time to run and is ultimately the deal breaker, even though stage2 is extremely quick. If I remember correctly, this was mitigated in many cases by using minified JSON, but that doesn't come up a lot in real scenarios, so I ultimately decided to go with the DOM backend as it's generally faster.

alexwennstrom commented 7 months ago

Sorry for not answering for a long time. One major advantage of tinygltf is the ability to work with the glTF data. For example, we use glTF as an intermediate format for merging or replacing glTF models/materials in the scene and then writing the combined glTF data. Unfortunately, fastgltf lacks at least the glTF writing part (I'm not sure about modifying asset data). Writing glTF may be trivial, but it is still an effort.

spnda commented 7 months ago

Both of those things are possible with fastgltf. 0.7.0 (released in February) added support for exporting glTFs and GLBs. Underneath, the code is not very sophisticated, as I just build up the JSON with a normal std::string, but I fuzz the library locally with over 250 different real-world assets and it works with all of them.
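For illustration, building JSON with a plain std::string can look roughly like the sketch below. This is not fastgltf's actual code; `escape_json` and `write_asset` are hypothetical helper names, and the only glTF detail assumed is the mandatory `asset` object with its `version` and optional `generator` fields.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: return `in` as a quoted JSON string literal,
// escaping quotes, backslashes, and control characters.
std::string escape_json(const std::string& in) {
    std::string out;
    out.reserve(in.size() + 2);
    out += '"';
    for (unsigned char c : in) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    // Remaining control characters must be written as \u00XX.
                    char buf[8];
                    std::snprintf(buf, sizeof buf, "\\u%04x",
                                  static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    out += '"';
    return out;
}

// Hypothetical helper: serialize a minimal glTF document that contains
// only the "asset" object, by plain string concatenation.
std::string write_asset(const std::string& generator) {
    std::string json = "{\"asset\":{\"version\":\"2.0\",\"generator\":";
    json += escape_json(generator);
    json += "}}";
    return json;
}
```

The trade-off of this style is that correct escaping and comma placement are entirely the writer's responsibility, which is where fuzzing against real assets helps.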

I am considering adding functionality that abstracts how the data within buffers is written, so that it properly adheres to the different quirks of glTF accessors. For now, you'll just have to know and respect the rules yourself.
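One such rule in the glTF 2.0 specification is alignment: an accessor's byte offset must be a multiple of the size of its component type. A minimal sketch of respecting that when appending data to a buffer is shown below; `align_up` and `append_aligned` are hypothetical names and are not part of fastgltf or tinygltf.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Round `offset` up to the next multiple of `alignment` (alignment > 0).
std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) / alignment * alignment;
}

// Append `size` bytes to `buffer`, inserting zero padding first so that the
// data starts at a multiple of `alignment`. Returns the byte offset at which
// the data was written, which would become the byteOffset in the JSON.
std::size_t append_aligned(std::vector<std::uint8_t>& buffer,
                           const void* data, std::size_t size,
                           std::size_t alignment) {
    const std::size_t offset = align_up(buffer.size(), alignment);
    buffer.resize(offset, 0);  // zero-fill the padding bytes
    const auto* bytes = static_cast<const std::uint8_t*>(data);
    buffer.insert(buffer.end(), bytes, bytes + size);
    return offset;
}
```

For example, after appending three 16-bit indices (6 bytes), appending 32-bit float positions with an alignment of 4 places them at offset 8 rather than 6.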

Also, I have a bit of news about the ondemand functionality. Through profiling and testing I found a few issues in my code, and fixing them boosted performance. I also got up to a 20% advantage on my M3 using ondemand. However, on Windows with x86 the performance still regresses. The branch is still open, and I will profile more when I have more time so that I can eventually pull it in. The spreadsheet linked in the performance chapter of the documentation has a table that compares both approaches directly.

syoyo commented 2 months ago

I made minijson:

https://github.com/syoyo/minijson

It is secure (fuzz tested), portable, and supports Unicode escape identifiers. It could be used as the default JSON parser for TinyGLTF in the next major release.
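Assuming "Unicode escape" here refers to JSON's `\uXXXX` sequences, the core of supporting them is turning the decoded code point into UTF-8, including surrogate pairs for characters outside the Basic Multilingual Plane. The sketch below shows that step only; it is not minijson's API, and `combine_surrogates` and `append_utf8` are hypothetical names.

```cpp
#include <cstdint>
#include <string>

// Combine a UTF-16 surrogate pair (high in 0xD800..0xDBFF, low in
// 0xDC00..0xDFFF) into a single Unicode code point.
std::uint32_t combine_surrogates(std::uint16_t high, std::uint16_t low) {
    return 0x10000u
         + ((static_cast<std::uint32_t>(high) - 0xD800u) << 10)
         + (static_cast<std::uint32_t>(low) - 0xDC00u);
}

// Append the UTF-8 encoding of code point `cp` (assumed <= 0x10FFFF) to `out`.
void append_utf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}
```

A hardened parser would additionally reject lone surrogates and malformed hex digits before reaching this step.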

syoyo / tinygltf

[Enhancement] Support more JSON backends #392