as-com / varint-simd

Decoding and encoding gigabytes of LEB128 variable-length integers per second in Rust with SIMD
https://docs.rs/varint-simd
Apache License 2.0
109 stars 10 forks

Use core::simd? #6

Open Inconn opened 8 months ago

Inconn commented 8 months ago

Have you considered using the core::simd module for SIMD? I went through the effort of porting the decode_two_unsafe function, and it seems to have the same performance for me.

Here's a godbolt link with a simplified implementation of it using both the core::simd and core::arch::x86_64 modules. The core::simd implementation actually has no unsafe code besides the transmute, although it does expect a [u8; 16] as input. It also compiles on other platforms, since the core::simd module is meant to be portable. I haven't tested it on anything other than x86_64, but it's supposed to act exactly the same.

I did port the function inside the actual library so I could benchmark it, but the code is really messy, so I only sent the godbolt link for now. I'll probably put it in a branch of my fork, but again, it's really messy and just thrown together.

Inconn commented 8 months ago

Here's the branch: https://github.com/Inconn/varint-simd/tree/core-simd-warning-really-messy

as-com commented 8 months ago

I did consider using std::simd, but it isn’t stable yet. It would be nice to get support for other architectures for “free”, but at this scale every CPU cycle counts, so it would probably still be best to use architecture-specific intrinsics.
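
To make the trade-off discussed above concrete, here is a minimal sketch of one small step of LEB128 decoding: finding the length of the first varint in a [u8; 16] buffer. It is not the code from the godbolt link or from the library's decode_two_unsafe. The names varint_len and decode_u32_scalar are hypothetical, chosen only for illustration. The sketch deliberately avoids core::simd so that it builds on stable Rust: it uses an SSE2 intrinsic from core::arch::x86_64 on x86_64 and a scalar fallback on every other target.

```rust
/// Length in bytes of the first LEB128 varint in `buf`, or `None` if all
/// 16 bytes have their continuation bit (0x80) set.
/// x86_64 path: SSE2 movemask collects the top bit of every byte.
#[cfg(target_arch = "x86_64")]
pub fn varint_len(buf: &[u8; 16]) -> Option<usize> {
    use core::arch::x86_64::{__m128i, _mm_loadu_si128, _mm_movemask_epi8};

    // SAFETY: SSE2 is part of the x86_64 baseline, `buf` is 16 readable
    // bytes, and `_mm_loadu_si128` allows unaligned addresses.
    let mask = unsafe {
        let v = _mm_loadu_si128(buf.as_ptr() as *const __m128i);
        _mm_movemask_epi8(v) as u32
    };

    // Bit i of `mask` is set when byte i continues the varint, so the
    // lowest clear bit marks the final byte.
    let len = (!mask).trailing_zeros() as usize + 1;
    if len <= 16 { Some(len) } else { None }
}

/// Portable fallback with the same contract for every other target.
#[cfg(not(target_arch = "x86_64"))]
pub fn varint_len(buf: &[u8; 16]) -> Option<usize> {
    buf.iter().position(|b| b & 0x80 == 0).map(|i| i + 1)
}

/// Scalar reference decoder for a single u32: returns the value and the
/// number of bytes consumed, or `None` if no terminating byte appears
/// within the first 5 bytes. Excess high bits in a fifth byte are
/// silently discarded in this sketch.
pub fn decode_u32_scalar(buf: &[u8; 16]) -> Option<(u32, usize)> {
    let mut value: u32 = 0;
    for (i, &byte) in buf.iter().take(5).enumerate() {
        value |= u32::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```

For a buffer that starts with the bytes 0xAC 0x02 followed by zeros, varint_len returns Some(2) and decode_u32_scalar returns Some((300, 2)). A core::simd version would replace the two cfg-gated functions with a single portable one, at the cost of requiring a nightly toolchain for as long as that module is unstable.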