Closed liuq19 closed 1 month ago
Decode performance comparison: avx is the same as sse, and about 20% slower than avx2.
➜ sonic2 git:(main) ✗ SONIC_MODE="noavx" go test -run=none -bench="BenchmarkDecoder_Binding_Sonic" -benchmem ./decoder
enabled sse
goos: linux
goarch: amd64
pkg: github.com/bytedance/sonic/decoder
cpu: Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
BenchmarkDecoder_Binding_Sonic-8        29247   40689 ns/op   320.35 MB/s   9644 B/op   127 allocs/op
BenchmarkDecoder_Binding_Sonic_Fast-8   36027   33836 ns/op   385.25 MB/s   6589 B/op    24 allocs/op
PASS
ok github.com/bytedance/sonic/decoder 3.224s

➜ sonic2 git:(main) ✗ SONIC_MODE="noavx2" go test -run=none -bench="BenchmarkDecoder_Binding_Sonic" -benchmem ./decoder
enabled avx
goos: linux
goarch: amd64
pkg: github.com/bytedance/sonic/decoder
cpu: Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
BenchmarkDecoder_Binding_Sonic-8        30001   40402 ns/op   322.63 MB/s   9650 B/op   127 allocs/op
BenchmarkDecoder_Binding_Sonic_Fast-8   34809   34243 ns/op   380.67 MB/s   6611 B/op    24 allocs/op
PASS
ok github.com/bytedance/sonic/decoder 3.221s

➜ sonic2 git:(main) ✗ go test -run=none -bench="BenchmarkDecoder_Binding_Sonic" -benchmem ./decoder
enabled avx2
goos: linux
goarch: amd64
pkg: github.com/bytedance/sonic/decoder
cpu: Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
BenchmarkDecoder_Binding_Sonic-8        34161   34069 ns/op   382.61 MB/s   9633 B/op   127 allocs/op
BenchmarkDecoder_Binding_Sonic_Fast-8   42513   27970 ns/op   466.04 MB/s   6600 B/op    24 allocs/op
PASS
ok github.com/bytedance/sonic/decoder 3.058s
Encode performance: avx is the same as sse, and the same as avx2.
➜ sonic2 git:(main) ✗ go test -run=none -bench=BenchmarkEncoder_Binding_Sonic -benchmem ./encoder
goos: linux
goarch: amd64
pkg: github.com/bytedance/sonic/encoder
cpu: Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
BenchmarkEncoder_Binding_Sonic-8        114964   10105 ns/op   1289.99 MB/s   14185 B/op   4 allocs/op
BenchmarkEncoder_Binding_Sonic_Fast-8   133300    8679 ns/op   1501.90 MB/s    9906 B/op   4 allocs/op
PASS
ok github.com/bytedance/sonic/encoder 2.552s

➜ sonic2 git:(main) ✗ SONIC_MODE="noavx2" go test -run=none -bench=BenchmarkEncoder_Binding_Sonic -benchmem ./encoder
goos: linux
goarch: amd64
pkg: github.com/bytedance/sonic/encoder
cpu: Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
BenchmarkEncoder_Binding_Sonic-8        111176   10876 ns/op   1198.53 MB/s   14204 B/op   4 allocs/op
BenchmarkEncoder_Binding_Sonic_Fast-8   131544    8754 ns/op   1489.10 MB/s    9929 B/op   4 allocs/op
PASS
ok github.com/bytedance/sonic/encoder 2.600s

➜ sonic2 git:(main) ✗ SONIC_MODE="noavx" go test -run=none -bench=BenchmarkEncoder_Binding_Sonic -benchmem ./encoder
goos: linux
goarch: amd64
pkg: github.com/bytedance/sonic/encoder
cpu: Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
BenchmarkEncoder_Binding_Sonic-8        107476   11097 ns/op   1174.62 MB/s   14221 B/op   4 allocs/op
BenchmarkEncoder_Binding_Sonic_Fast-8   135178    8671 ns/op   1503.37 MB/s    9903 B/op   4 allocs/op
PASS
ok github.com/bytedance/sonic/encoder 2.605s
Get performance: avx is about 15% faster than sse, and avx2 is about 40% faster than avx.
➜ sonic2 git:(main) ✗ go test -run=none -bench="BenchmarkGetOne_Sonic" -benchmem ./ast
goos: linux
goarch: amd64
pkg: github.com/bytedance/sonic/ast
cpu: Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
BenchmarkGetOne_Sonic-8   735288   1598 ns/op   8148.42 MB/s   24 B/op   1 allocs/op
PASS
ok github.com/bytedance/sonic/ast 1.207s

➜ sonic2 git:(main) ✗ SONIC_MODE="noavx2" go test -run=none -bench="BenchmarkGetOne_Sonic" -benchmem ./ast
goos: linux
goarch: amd64
pkg: github.com/bytedance/sonic/ast
cpu: Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
BenchmarkGetOne_Sonic-8   557578   2277 ns/op   5720.21 MB/s   24 B/op   1 allocs/op
PASS
ok github.com/bytedance/sonic/ast 1.306s

➜ sonic2 git:(main) ✗ SONIC_MODE="noavx" go test -run=none -bench="BenchmarkGetOne_Sonic" -benchmem ./ast
goos: linux
goarch: amd64
pkg: github.com/bytedance/sonic/ast
cpu: Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
BenchmarkGetOne_Sonic-8   471970   2614 ns/op   4982.66 MB/s   24 B/op   1 allocs/op
PASS
ok github.com/bytedance/sonic/ast 1.275s
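
The percentages quoted for the decode and Get benchmarks can be cross-checked from the ns/op columns. The following is only a small sketch of that arithmetic; `speedupPercent` is a hypothetical helper written for this note, not part of sonic, and the inputs are the ns/op values copied from the runs above.

```go
package main

import "fmt"

// speedupPercent reports how much faster the "fast" run is than the
// "slow" run, in percent, computed as (slowNs/fastNs - 1) * 100.
// Hypothetical helper for illustration only.
func speedupPercent(slowNsPerOp, fastNsPerOp float64) float64 {
	return (slowNsPerOp/fastNsPerOp - 1) * 100
}

func main() {
	// BenchmarkDecoder_Binding_Sonic: avx (noavx2) vs avx2 (default).
	fmt.Printf("decode: avx2 over avx: %.1f%%\n", speedupPercent(40402, 34069))
	// BenchmarkGetOne_Sonic: sse (noavx) vs avx (noavx2).
	fmt.Printf("get:    avx over sse:  %.1f%%\n", speedupPercent(2614, 2277))
	// BenchmarkGetOne_Sonic: avx (noavx2) vs avx2 (default).
	fmt.Printf("get:    avx2 over avx: %.1f%%\n", speedupPercent(2277, 1598))
}
```

With these inputs the ratios come out at roughly 19%, 15% and 42%, in line with the "about 20%", "about 15%" and "about 40%" figures quoted in the text.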
There are two reasons not to support noavx2 mode:

1. noavx2 and noavx are almost the same (besides the Get API).
2. A CPU with avx but without avx2 is unusual.

Solution: if SONIC_MODE=noavx2 is set, it will act as noavx mode.
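
The fallback described in the solution can be sketched roughly as follows. This is not sonic's actual implementation: `resolveMode` and its `hasAVX2` parameter are hypothetical names, the returned labels simply mirror the "enabled sse" / "enabled avx2" lines printed in the benchmark runs, and the choice to fall back to sse on a CPU without AVX2 is an assumption made for the sketch.

```go
package main

import (
	"fmt"
	"os"
)

// resolveMode is a hypothetical helper, not part of sonic.
// env is the value of SONIC_MODE; hasAVX2 says whether the CPU
// supports AVX2 (a real program would detect this at startup).
func resolveMode(env string, hasAVX2 bool) string {
	switch env {
	case "noavx", "noavx2":
		// Proposed behaviour: noavx2 is handled exactly like noavx.
		return "sse"
	}
	if hasAVX2 {
		return "avx2"
	}
	// Assumption: without AVX2 there is no separate avx path.
	return "sse"
}

func main() {
	// hasAVX2 is hard-coded here to keep the sketch self-contained.
	mode := resolveMode(os.Getenv("SONIC_MODE"), true)
	fmt.Println("enabled", mode)
}
```

Under this sketch, running with SONIC_MODE=noavx2 and with SONIC_MODE=noavx select the same code path, so only two variants need to be maintained.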