The design of the SVE vtbl lets us combine successive lookups for a multi-lookup table with a plain bitwise OR because, unlike x86 shuffles, index values outside of the active 32-bit slice are guaranteed to return zero.
This has no impact on compression, but gives a small (0.5%) improvement to decompression performance.
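
To illustrate the idea, here is a minimal scalar sketch in portable C. It models the lookup semantics only and is not the actual intrinsic code from this change: `SLICE`, `tbl1` and `multi_lookup` are hypothetical names, and the 16-byte slice size is an assumption chosen for the example. Each slice is queried with a rebased index; any index that falls outside the slice yields zero, so the partial results can be OR-ed together without masking. An x86 `pshufb`, by contrast, only zeroes a lane when the index's top bit is set and otherwise uses the low four bits, so it would need an extra select or mask step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLICE 16 /* bytes per table slice; assumed for illustration */

/* Scalar model of a TBL-style lookup: out-of-range indices return 0. */
static uint8_t tbl1(const uint8_t *slice, uint8_t idx)
{
    return idx < SLICE ? slice[idx] : 0;
}

/* Look up idx in a table made of `slices` consecutive slices by
 * OR-ing one lookup per slice. At most one slice sees an in-range
 * index; every other lookup contributes 0. */
static uint8_t multi_lookup(const uint8_t *table, size_t slices, uint8_t idx)
{
    assert(slices <= 256 / SLICE); /* keeps the 8-bit wrap-around out of range */

    uint8_t result = 0;
    for (size_t s = 0; s < slices; s++) {
        /* Rebase the index onto this slice. If idx lies below the slice,
         * the subtraction wraps to a value >= SLICE, i.e. out of range. */
        uint8_t rebased = (uint8_t)(idx - s * SLICE);
        result |= tbl1(table + s * SLICE, rebased);
    }
    return result;
}
```

For a 64-byte table, `multi_lookup(table, 4, i)` returns `table[i]` for `i` in 0..63 and 0 for any larger index.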