Diggsey / ijson

More efficient alternative to `serde_json::Value` which saves memory by interning primitive values and using tagged pointers.
Apache License 2.0
127 stars 14 forks source link

add simd-json and json-rust #9

Closed gkorland closed 1 year ago

gkorland commented 2 years ago

@Diggsey I added two more libraries to the benchmark, but it seems like on simd there is some error.

Diggsey commented 2 years ago

Thanks for the PR! I'll try to look at this ASAP but feel free to bug me if I don't get around to it in the next few days.

Diggsey commented 2 years ago

I looked into this, and there are two reasons why it errors:

1) There is some lazily initialized static data from the ahash crate (simd_json -> halfbrown -> hashbrown -> ahash) which was being counted as a leak. I resolved this by forcing this to be initialized outside of the tracked region.

2) There is a soundness bug in simd_json's AlignedBuf implementation, which triggered mockalloc to raise an error: https://github.com/simd-lite/simd-json/issues/213

gkorland commented 2 years ago

@Diggsey it seems like the bug was fixed in simd_json

Diggsey commented 2 years ago

This is the fix for the mockalloc issue:

 fn test_simd_decode(data: &[u8]) -> (AllocInfo, AllocInfo) {
+    // `simd_json` uses `hashbrown` which uses `ahash` as the default hasher,
+    // which lazily initializes some static data. Force the initialization to
+    // happen early, or else it will be considered a memory leak.
+    let _ = simd_json::OwnedValue::Object(Default::default());

The new dependencies should also be dev dependencies.

I'd be open to a PR in future to make serde_json a dev dependency too - currently it's only used for its Error type.

gkorland commented 1 year ago

@Diggsey do you think we should move ahead with this PR?

Diggsey commented 1 year ago

Happy to merge if the errors are fixed and those dependencies are made dev-dependencies.

Diggsey commented 1 year ago

Closing due to inactivity. Feel free to re-open.