zwegner / faster-utf8-validator

A very fast library for validating UTF-8 using AVX2/SSE4 instructions

AVX512 Validation #1

Open Wunkolo opened 4 years ago

Wunkolo commented 4 years ago

This isn't so much an issue as it is a tip, but you mention not having access to an AVX-512 machine:

This algorithm should map fairly nicely to AVX-512, and should in fact be a bit faster than 2x the speed of AVX2 since a few instructions can be saved. But I don't have an AVX-512 machine, so I haven't tried it yet.

I wanted to mention that the Intel Software Development Emulator can simulate AVX-512 instructions for any program you pass into it, and can allow you to verify an AVX-512 implementation.

I'm also willing to benchmark any AVX-512 implementations you have on my Skylake-X i9-7900X and on the upcoming CascadeLake-X processor that I will be getting soon.
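As a rough illustration of the SDE suggestion above, here is a minimal Python sketch that builds and runs an SDE command line. It assumes the SDE launcher is on the PATH as `sde64` and that `-skx` selects Skylake server emulation (a chip such as `-icl` would be the choice for instructions Skylake-X lacks). The test binary name `./utf8_test` is hypothetical and not part of this repository.

```python
import shutil
import subprocess


def sde_command(program, args=(), chip="skx", sde="sde64"):
    """Build an argv list that runs `program` under SDE emulating `chip`.

    `sde` (launcher name) and `chip` are assumptions about the local setup.
    """
    return [sde, f"-{chip}", "--", program, *args]


def run_under_sde(program, args=(), chip="skx"):
    """Run `program` under SDE and return its exit code."""
    cmd = sde_command(program, args, chip=chip)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH")
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical test binary built with AVX-512 enabled.
    print(" ".join(sde_command("./utf8_test")))
```

Emulation of this kind only checks correctness; timings taken under SDE say nothing about real hardware performance.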

zwegner commented 4 years ago

Cool, thanks! I knew about the SDE, but hadn't used it in a while. It has apparently stopped working on macOS since the last time I used it, and now fails with: E: Unable to run task_for_pid() on the process Pin is attaching to.

And thanks for your offer, too! I started working on an AVX-512 implementation today. It looks like some of the instructions I had in mind (in particular vpermb) were not yet supported on Skylake-X. There's a rather annoying amount of fragmentation in the various AVX-512 extensions, along with Intel's "lake" fixation... I guess that's the price to pay for using all those sweet, sweet instructions. There is at least an AVX-512 version of vpshufb on Skylake-X that is almost equivalent, but needs an extra mask for the three error table lookups. I'll let you know when I get something testable. I might just end up renting some time on a cloud machine so I can work out any performance kinks iteratively.
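To make the vpermb/vpshufb difference concrete, here is a scalar Python model of the two instructions' documented byte-selection rules. It is only a sketch of one plausible reading of the "extra mask" remark, not the library's code: the 16-entry table and the helper names are hypothetical, and the real error tables are not reproduced here.

```python
def vpshufb(table: bytes, idx: bytes) -> bytes:
    """Scalar model of VPSHUFB: per 128-bit lane, low 4 index bits select,
    and a set high bit (0x80) forces the output byte to zero."""
    assert len(table) == len(idx) and len(table) % 16 == 0
    out = bytearray(len(idx))
    for i, ix in enumerate(idx):
        if ix & 0x80:
            out[i] = 0
        else:
            lane_base = i & ~15  # start of this byte's 16-byte lane
            out[i] = table[lane_base + (ix & 0x0F)]
    return bytes(out)


def vpermb(table: bytes, idx: bytes) -> bytes:
    """Scalar model of 512-bit VPERMB: low 6 index bits select across
    the whole 64-byte register, and higher bits are ignored."""
    assert len(table) == 64 and len(idx) == 64
    return bytes(table[ix & 0x3F] for ix in idx)


# Hypothetical 16-entry lookup table, replicated into all four lanes.
TABLE16 = bytes(range(100, 116))
TABLE64 = TABLE16 * 4


def lookup_low_nibble_shufb(data: bytes) -> bytes:
    # The extra AND clears bit 7 so that no output byte is zeroed.
    masked = bytes(b & 0x0F for b in data)
    return vpshufb(TABLE64, masked)


def lookup_low_nibble_permb(data: bytes) -> bytes:
    # With the table replicated 4x, index bits 4-5 are harmless,
    # so the raw bytes can be used as indices directly.
    return vpermb(TABLE64, data)
```

In this model, feeding unmasked bytes with the high bit set to `vpshufb` yields zeros, which is why the vpshufb path needs the additional AND while the vpermb path does not.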

travisdowns commented 4 years ago

Heh, I came here to write about SDE, which I use (but on Linux), but it seems someone beat me to it :).

It's weird that it doesn't work on Mac - I am quite sure there must be some workaround as this is an important product to Intel. Probably it's a security setting somewhere that needs to be tweaked. Maybe this one - but hopefully there's something easier because that seems ugly.

zwegner commented 4 years ago

It is weird, especially given that I was using it successfully 10 months ago after this thread about removing spaces in text with AVX-512/PEXT where you commented too... I recently updated to 10.14, but I think I was on OS 10.12 then, which shouldn't have worked according to that thread.

I definitely don't want to disable SIP, so I suppose I'll either rent some cloud time or fire up a VM...

travisdowns commented 4 years ago

I recently updated to 10.14, but I think I was on OS 10.12 then, which shouldn't have worked according to that thread.

I guess it is possible that Intel is jumping through the required hoops to bypass these security things, but that the hoops changed between 10.12 and 10.14. Are you on the most recent SDE version?

or fire up a VM...

Yeah a Linux VM would probably work fine for SDE.

zwegner commented 4 years ago

Are you on the most recent SDE version?

Yep. And tried their workaround of running Pin's host_config utility...

Yeah a Linux VM would probably work fine for SDE.

Given how few people seem to have publicly posted about this, that's the path I'm going to go down.