mongodb / bson-rust

Encoding and decoding support for BSON in Rust
MIT License

Use SIMD to optimize UTF-8 validation. #437

Closed Liyixin95 closed 10 months ago

Liyixin95 commented 11 months ago

benchmark env

benchmark suite

id: 10, 12, 14

my computer

Windows 11, 13th Gen Intel(R) Core(TM) i7-13700H, 32.0 GB RAM at 3200 MHz

benchmark result

original

Running tests nonverbosely...

Running Flat BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Flat BSON Decoding -- Score: 367.726 MB/s, Median Iteration Time: 0.205s

Running Deep BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Deep BSON Decoding -- Score: 67.747 MB/s, Median Iteration Time: 0.290s

Running Full BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Full BSON Decoding -- Score: 274.048 MB/s, Median Iteration Time: 0.209s

BSONBench Score = 236.507 MB/s

bstr

Running tests nonverbosely...

Running Flat BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Flat BSON Decoding -- Score: 394.848 MB/s, Median Iteration Time: 0.191s

Running Deep BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Deep BSON Decoding -- Score: 73.264 MB/s, Median Iteration Time: 0.268s

Running Full BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Full BSON Decoding -- Score: 277.837 MB/s, Median Iteration Time: 0.206s

BSONBench Score = 248.650 MB/s

bstr needs to be pinned to <1.7.0 to satisfy the MSRV requirement.

simdutf8

Running tests nonverbosely...

Running Flat BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Flat BSON Decoding -- Score: 404.186 MB/s, Median Iteration Time: 0.186s

Running Deep BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Deep BSON Decoding -- Score: 74.206 MB/s, Median Iteration Time: 0.265s

Running Full BSON Decoding...
  [00:01:01] [########################################] 100/100 (0s)
TEST: Full BSON Decoding -- Score: 281.251 MB/s, Median Iteration Time: 0.204s

BSONBench Score = 253.214 MB/s

simdutf8 does not provide an equivalent of from_utf8_lossy, so only the strict (non-lossy) conversion path was optimized.
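
For illustration, here is a minimal sketch of how a lossy path could still take advantage of a fast strict validator: try strict validation first and only fall back to the standard library's lossy conversion when the input is invalid. The helper names `validate_strict` and `decode_lossy` are hypothetical and not part of bson-rust. To keep the sketch runnable without extra crates, the strict validator here is `std::str::from_utf8`; swapping in a SIMD validator such as `simdutf8::basic::from_utf8` is an assumption and was not benchmarked above.

```rust
use std::borrow::Cow;

/// Stand-in for a fast strict validator (hypothetical helper).
/// This sketch uses the standard library; replacing it with a SIMD
/// validator is an assumption, not something measured in this issue.
fn validate_strict(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

/// Hypothetical helper: strict fast path first, lossy fallback only
/// when the input contains invalid UTF-8.
fn decode_lossy(bytes: &[u8]) -> Cow<'_, str> {
    match validate_strict(bytes) {
        // Valid input: borrow the bytes as-is, no allocation.
        Some(s) => Cow::Borrowed(s),
        // Invalid input: invalid sequences are replaced with U+FFFD.
        None => String::from_utf8_lossy(bytes),
    }
}

fn main() {
    let valid = decode_lossy(b"hello");
    let invalid = decode_lossy(b"he\xFFllo");
    println!("{} / {}", valid, invalid);
}
```

With this shape, valid input (presumably the common case) pays only for the strict check, while invalid input is scanned twice.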

conclusion

On my computer bstr appears to be slower than simdutf8, possibly because bstr does not use AVX2. I therefore lean toward using simdutf8 to optimize UTF-8 validation, but the final decision is up to your team.
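
To put the composite scores in perspective, the relative gains can be computed directly from the BSONBench numbers reported above. The sketch below is only arithmetic on those three scores, and `percent_gain` is a hypothetical helper name. The numbers come from a single machine with no variance reported, so small differences should be read with caution.

```rust
/// Percentage change of `new` relative to `base`.
fn percent_gain(base: f64, new: f64) -> f64 {
    (new - base) / base * 100.0
}

fn main() {
    // BSONBench composite scores in MB/s, copied from the runs above.
    let original = 236.507;
    let bstr = 248.650;
    let simdutf8 = 253.214;

    println!("bstr vs original:     {:+.2}%", percent_gain(original, bstr));
    println!("simdutf8 vs original: {:+.2}%", percent_gain(original, simdutf8));
    println!("simdutf8 vs bstr:     {:+.2}%", percent_gain(bstr, simdutf8));
}
```

This works out to roughly 5% for bstr and 7% for simdutf8 over the original, with about 2% between the two crates.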

isabelatkinson commented 11 months ago

Hi @Liyixin95, thanks for running these benchmarks! Would you be interested in making a PR to switch over to one of the libraries you profiled? We don't currently have the bandwidth to do this but would be happy to review it. We'd also want to run these benchmarks on our machines to ensure that these performance improvements can be reproduced. Otherwise I can file a ticket to consider doing this in the future.

github-actions[bot] commented 10 months ago

There has not been any recent activity on this ticket, so we are marking it as stale. If we do not hear anything further from you, this issue will be automatically closed in one week.

github-actions[bot] commented 10 months ago

There has not been any recent activity on this ticket, so we are closing it. Thanks for reaching out and please feel free to file a new issue if you have further questions.