Open danburkert opened 4 years ago
See also https://github.com/gnzlbg/bitintr for safe, cross-platform wrappers over the intrinsics.
So I did some quick and dirty prototyping with varint-simd v0.3.0, and here's what I found:
This is probably because the only encode/decode function is for single u64s, which is currently a weak point for varint-simd (it's not that much faster than other implementations when encoding/decoding small u64s).
I suspect some larger-scale refactoring will be needed to take full advantage of varint-simd. For example, protobuf tags are at most 32 bits long, so a lot of cycles can be saved by encoding/decoding them with 32-bit routines instead of the general u64 path.
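To make the 32-bit point concrete, here is a minimal scalar sketch of a varint codec bounded to u32. It is not the varint-simd API; the function names `encode_varint_u32` and `decode_varint_u32` are made up for illustration. The only point is that a 32-bit value never needs more than 5 bytes, so the loop and the buffer can be half the size of the u64 case.

```rust
/// Hypothetical helper (not part of varint-simd): encode `value` as a
/// LEB128 varint into `buf` and return the number of bytes written (1..=5).
pub fn encode_varint_u32(mut value: u32, buf: &mut [u8; 5]) -> usize {
    let mut i = 0;
    while value >= 0x80 {
        // Low 7 bits go in this byte; the high bit marks "more bytes follow".
        buf[i] = (value as u8 & 0x7F) | 0x80;
        value >>= 7;
        i += 1;
    }
    buf[i] = value as u8;
    i + 1
}

/// Hypothetical helper (not part of varint-simd): decode a varint that must
/// fit in a u32. Returns the value and the number of bytes consumed, or
/// `None` if the input is truncated or does not fit in 32 bits.
pub fn decode_varint_u32(bytes: &[u8]) -> Option<(u32, usize)> {
    let mut result: u32 = 0;
    // A u32 needs at most 5 varint bytes, so never look further than that.
    for (i, &byte) in bytes.iter().take(5).enumerate() {
        let payload = (byte & 0x7F) as u32;
        // The fifth byte may only contribute the top 4 bits of a u32.
        if i == 4 && payload > 0x0F {
            return None;
        }
        result |= payload << (7 * i);
        if byte & 0x80 == 0 {
            return Some((result, i + 1));
        }
    }
    None
}
```

A u64 decoder has to allow for up to 10 bytes; capping the loop at 5 is the scalar analogue of the savings mentioned above.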
My library also just added support for quickly decoding two, four, or eight adjacent varints in parallel (subject to size limitations), with some really good throughput figures. Most of the time, a protobuf field will be a 32-bit tag followed by a 32-bit number or length, and decode requests can be shrunk based on how large the data field is in the .proto file. So there are likely a lot more gains to be had.
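For reference, the "tag followed by a number or length" pattern looks roughly like this in plain scalar code. This is only a sketch of the access pattern, not varint-simd's parallel decoder, and `decode_varint` and `decode_key_and_value` are hypothetical names. The `max_len` parameter stands in for shrinking the decode request when the field is known to be small.

```rust
/// Hypothetical helper: decode one LEB128 varint, reading at most `max_len`
/// bytes (capped at 10, the longest encoding of a u64). Returns the value
/// and the number of bytes consumed, or `None` if no terminating byte is
/// found within the limit.
pub fn decode_varint(bytes: &[u8], max_len: usize) -> Option<(u64, usize)> {
    let mut result: u64 = 0;
    for (i, &byte) in bytes.iter().take(max_len.min(10)).enumerate() {
        // Bits shifted past position 63 on a tenth byte are simply dropped.
        result |= ((byte & 0x7F) as u64) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((result, i + 1));
        }
    }
    None
}

/// Hypothetical helper: decode a protobuf key and the varint that follows
/// it. Returns (field number, wire type, value, total bytes consumed).
/// For length-delimited fields the second varint is the payload length.
pub fn decode_key_and_value(bytes: &[u8]) -> Option<(u32, u8, u64, usize)> {
    // Keys fit in 32 bits, so 5 bytes is enough for the first varint.
    let (key, key_len) = decode_varint(bytes, 5)?;
    let (value, value_len) = decode_varint(&bytes[key_len..], 10)?;
    // A protobuf key is (field_number << 3) | wire_type.
    let field_number = (key >> 3) as u32;
    let wire_type = (key & 0x7) as u8;
    Some((field_number, wire_type, value, key_len + value_len))
}
```

With the bytes `08 96 01` (field 1, wire type 0, value 150), the two varints sit back to back, which is the layout a two-varint parallel decode would target.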
https://www.reddit.com/r/rust/comments/f36j05/comment/fhhwqp9