NLnetLabs / simdzone

Fast and standards compliant DNS zone parser
BSD 3-Clause "New" or "Revised" License

Implement Ice Lake kernel #94

Open k0ekk0ek opened 1 year ago

k0ekk0ek commented 1 year ago

So far, we support only SSE4.1 and AVX2, but AVX-512 may greatly improve speed. An initial port of simd.h won't require much work and halves the number of operations for the scanner. I expect AVX-512 to improve parsing of certain data types, such as base16 and base64 sequences, although we can worry about those at a later stage and start off by just including the AVX2 operations and go from there.

lemire commented 1 year ago

> An initial port of simd.h won't require much work and halves the number of operations for the scanner.

In many instances, AVX-512 can be twice as fast on the same hardware. But if you merely do a straight port, the likely outcome is that performance won't improve. It is not the wider registers that actually help most. E.g., Zen 4 still uses 256-bit operations internally and is competitive with Ice Lake. It is somewhat misleading to think of AVX-512 as just wider registers (though it is that). AVX-512 requires a "from the ground up" design to really shine. Of course, it is not a daring research question: simdjson shines with AVX-512. And it is not super hard... but it is not a refactoring problem.

k0ekk0ek commented 1 year ago

As a first project I'm hoping to port the scanner (or stage1 in simdjson terms) to get an initial kernel started. At least, I expect that part to be relatively straightforward. After that I hope to implement faster parsing of base16 sequences, hoping that compress will make a big difference there. Over the last week I added many RRs and a lot of them use hex encoding, e.g. EUI48 and EUI64 (or MAC addresses), which are encoded as xx-xx-xx-xx-xx-xx (this is relatively straightforward in SSE too), but also just plain DS records. I have some ideas for making that better. But I'm not very experienced with AVX-512 yet, so this may prove harder than I expect :sweat_smile:
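
To make the compress idea concrete, here is a minimal scalar sketch in C, not code from simdzone. It models what a mask-driven byte compress would do for an EUI48 in xx-xx-xx-xx-xx-xx form: build a bitmask of the dashes, check it against the expected layout, pack the remaining hex digits contiguously, then decode them in pairs. The names `compress_bytes`, `hex_value` and `parse_eui48` are hypothetical. In an AVX-512 kernel the loop in `compress_bytes` would presumably be replaced by a single VBMI2 instruction such as `_mm512_maskz_compress_epi8`, and the dash mask by a vector compare.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a byte compress: copy the input bytes whose bit is set
 * in `keep` to `out`, packed contiguously. Returns the number of bytes
 * written. Handles at most 64 input bytes (one 512-bit register). */
static size_t compress_bytes(
  const uint8_t *in, size_t len, uint64_t keep, uint8_t *out)
{
  size_t n = 0;
  for (size_t i = 0; i < len && i < 64; i++)
    if (keep & (UINT64_C(1) << i))
      out[n++] = in[i];
  return n;
}

/* Value of a hexadecimal digit, or -1 if `c` is not one. */
static int hex_value(uint8_t c)
{
  if (c >= '0' && c <= '9')
    return c - '0';
  c |= 0x20; /* fold upper case letters to lower case */
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  return -1;
}

/* Parse "xx-xx-xx-xx-xx-xx" into 6 octets. Returns 0 on success, -1 on
 * a syntax error. */
static int parse_eui48(const char *text, size_t len, uint8_t mac[6])
{
  /* Dashes must sit at offsets 2, 5, 8, 11 and 14, and nowhere else. */
  const uint64_t expected_dashes =
    (1u << 2) | (1u << 5) | (1u << 8) | (1u << 11) | (1u << 14);
  uint8_t digits[17];
  uint64_t dashes = 0;

  if (len != 17)
    return -1;
  /* A SIMD kernel would get this mask from one compare against '-'. */
  for (size_t i = 0; i < len; i++)
    dashes |= (uint64_t)(text[i] == '-') << i;
  if (dashes != expected_dashes)
    return -1;

  /* Keep everything that is not a dash: 12 hex digits, packed. */
  uint64_t keep = ~dashes & ((UINT64_C(1) << len) - 1);
  if (compress_bytes((const uint8_t *)text, len, keep, digits) != 12)
    return -1;

  for (size_t i = 0; i < 6; i++) {
    int hi = hex_value(digits[2 * i]);
    int lo = hex_value(digits[2 * i + 1]);
    if (hi < 0 || lo < 0)
      return -1;
    mac[i] = (uint8_t)((hi << 4) | lo);
  }
  return 0;
}
```

The same mask-then-compress shape would apply to other hex-encoded fields that allow separators or whitespace between digits. Only the expected mask changes.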

lemire commented 1 year ago

I recommend requiring VBMI2. That's what you get in Ice Lake and Zen 4.
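
One way such a requirement could be enforced at runtime is sketched below. This is an assumption about how a kernel might be gated, not simdzone's actual dispatch code. The helper name `have_icelake_kernel` is hypothetical, and so is the choice to check AVX-512F, BW and VBMI alongside VBMI2. The sketch relies on the GCC/Clang builtin `__builtin_cpu_supports`, which needs a reasonably recent compiler to recognise the `"avx512vbmi2"` feature string, and it reports no support on other compilers or on non-x86 targets.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true if the CPU (and OS) report the
 * AVX-512 subsets an Ice Lake style kernel is assumed to need here. */
static bool have_icelake_kernel(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();
  return __builtin_cpu_supports("avx512f")
      && __builtin_cpu_supports("avx512bw")
      && __builtin_cpu_supports("avx512vbmi")
      && __builtin_cpu_supports("avx512vbmi2");
#else
  /* Unknown compiler or architecture: fall back to another kernel. */
  return false;
#endif
}
```

A caller would check this once at startup and select the AVX-512 kernel only when it returns true, falling back to the existing SSE4.1 or AVX2 code otherwise.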