tidwall / gjson

Get JSON values quickly - JSON parser for Go
MIT License
14.31k stars 854 forks source link

Boost API performance by using SIMD-implemented text processors #366

Closed AsterDY closed 1 month ago

AsterDY commented 2 months ago

Hi @tidwall , I'm the maintainer of sonic (https://github.com/bytedance/sonic). Sonic implements a set of high-performance text processors with SIMD-based C assembly, which can greatly improve the speed of JSON-skipping and string-decoding. Here are some references to our algorithm:

Meanwhile, gjson provides a set of proprietary APIs that are popular among Golang developers, thus it is hard to migrate their dependencies from gjson to sonic. If these two are combined, it will benefit a vast number of Golang users. Are you willing to depend on sonic to boost gjson's performance? If so, I will submit a pull request to gjson. Here are the estimated benefits after introducing sonic:

goversion: 1.22.0
goos: linux
goarch: amd64
cpu: Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
                         │  master.out   │             target.out              │
                         │    sec/op     │   sec/op     vs base                │
GetComplexPath/small-32     3.000µ ±  6%   2.591µ ± 6%  -13.65% (p=0.000 n=10)
GetComplexPath/medium-32   24.855µ ±  5%   6.464µ ± 8%  -74.00% (p=0.000 n=10)
GetComplexPath/large-32    1309.1µ ±  3%   270.0µ ± 8%  -79.37% (p=0.000 n=10)
GetSimplePath/small-32      702.1n ± 11%   634.3n ± 5%   -9.66% (p=0.000 n=10)
GetSimplePath/medium-32     9.558µ ±  3%   1.744µ ± 4%  -81.76% (p=0.000 n=10)
GetSimplePath/Large-32     342.03µ ± 14%   37.83µ ± 5%  -88.94% (p=0.000 n=10)
geomean                     24.64µ         7.576µ       -69.26%

                         │  master.out  │             target.out              │
                         │     B/op     │    B/op     vs base                 │
GetComplexPath/small-32    104.0 ± 0%     104.0 ± 0%       ~ (p=1.000 n=10) ¹
GetComplexPath/medium-32   16.00 ± 0%     16.00 ± 0%       ~ (p=1.000 n=10) ¹
GetComplexPath/large-32    16.00 ± 0%     16.00 ± 0%       ~ (p=1.000 n=10) ¹
GetSimplePath/small-32     0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
GetSimplePath/medium-32    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
GetSimplePath/Large-32     0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                               ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                         │  master.out  │             target.out              │
                         │  allocs/op   │ allocs/op   vs base                 │
GetComplexPath/small-32    6.000 ± 0%     6.000 ± 0%       ~ (p=1.000 n=10) ¹
GetComplexPath/medium-32   2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
GetComplexPath/large-32    2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
GetSimplePath/small-32     0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
GetSimplePath/medium-32    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
GetSimplePath/Large-32     0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                               ²               +0.00%                ²
¹ all samples are equal
tidwall commented 2 months ago

No thanks. Your library look interesting and fast, but I generally avoid adding dependencies to my libraries that I can't maintain. Also gjson is super stable right now and I would rather not deal with potential issues I have no control over. Finally, I ran a quick test and it appears that your library allocates memory when accessing data, which would be a regression in gjson.

AsterDY commented 2 months ago

Thanks, and I understand your concern. Please allow me to explain the last point "I ran a quick test and it appears that your library allocates memory when accessing data" first: I guess you just tested sonic.Get() (which is implemented by parsing-on-demands instead of repeating searching) on your data (whose scale is kind of inconsistent with the practical ones), but sonic.Get has different target users with yours, at present. Actually, this is how I intend to integrate sonic's algorithm into gjson, and here is how I benched. If you think it is not suitable to merge it, I will put it on sonic's repo.

tidwall commented 1 month ago

I see. I appreciate the additional context. Perhaps I’ll take a deeper look at your changes.

AsterDY commented 1 month ago

Any progress? @tidwall

tidwall commented 1 month ago

I don't have the free time to deeply analyze sonic and I'm not that that interested in adding dependencies to gjson.

That said, I took a cursory look at the sonic code, and from what I can see there are many instances of unsafe and runtime linkage using go:linkname. Often without any comments.

These are red flags that indicates to me that sonic may pose a potential future risk to end users.

I suggest you maintain your own fork.

AsterDY commented 1 month ago

I don't have the free time to deeply analyze sonic and I'm not that that interested in adding dependencies to gjson.

That said, I took a cursory look at the sonic code, and from what I can see there are many instances of unsafe and runtime linkage using go:linkname. Often without any comments.

These are red flags that indicates to me that sonic may pose a potential future risk to end users.

I suggest you maintain your own fork.

Sure.