VectorCamp / vectorscan

A portable fork of the high-performance regular expression matching library
https://www.vectorcamp.gr/project/vectorscan/

Speed up truffle with 256b TBL instructions #282

Closed ypicchi-arm closed 4 months ago

ypicchi-arm commented 4 months ago

256b-wide SVE vectors allow some simplification of truffle. Up to 40% speedup on Graviton3; the microbenchmark goes from 12500 MB/s to 17000 MB/s. SVE2 also offers this capability for 128b vectors, with a speedup of around 25% compared to plain SVE.

Add unit tests and a benchmark for this wide variant.
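
For readers unfamiliar with truffle, here is a scalar sketch of why a 32-entry table lookup could simplify it. It is a model under stated assumptions, not the code in this PR. It assumes the narrow variant uses two 16-entry tables, chosen by the top bit of the byte and indexed by the low nibble. It also assumes the wide variant uses a single 32-entry table indexed by the low 5 bits, with the top 3 bits selecting the bit to test. All names (`NarrowMasks`, `buildWide`, `matchWide`, and so on) are hypothetical and do not come from vectorscan.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; this is not the vectorscan API.
using Table16 = std::array<uint8_t, 16>;
using Table32 = std::array<uint8_t, 32>;

// Assumed narrow layout: two 16-entry tables, selected by the top bit.
struct NarrowMasks {
    Table16 highClear{};
    Table16 highSet{};
};

NarrowMasks buildNarrow(const std::bitset<256> &cls) {
    NarrowMasks m;
    for (unsigned c = 0; c < 256; ++c) {
        if (!cls[c]) continue;
        // Bits 4..6 of the byte pick which bit of the table entry is set.
        uint8_t bit = static_cast<uint8_t>(1u << ((c >> 4) & 7));
        if (c & 0x80) m.highSet[c & 0xf] |= bit;
        else          m.highClear[c & 0xf] |= bit;
    }
    return m;
}

bool matchNarrow(const NarrowMasks &m, uint8_t c) {
    // Two possible lookups; the top bit decides which table is used.
    const Table16 &t = (c & 0x80) ? m.highSet : m.highClear;
    return (t[c & 0xf] & (1u << ((c >> 4) & 7))) != 0;
}

// Assumed wide layout: one 32-entry table, as a 32-byte TBL could index.
Table32 buildWide(const std::bitset<256> &cls) {
    Table32 t{};
    for (unsigned c = 0; c < 256; ++c)
        if (cls[c]) t[c & 0x1f] |= static_cast<uint8_t>(1u << (c >> 5));
    return t;
}

bool matchWide(const Table32 &t, uint8_t c) {
    // Single lookup: low 5 bits index the table, top 3 bits pick the bit.
    return ((t[c & 0x1f] >> (c >> 5)) & 1u) != 0;
}

// Check that both layouts agree with the reference class for every byte.
bool variantsAgree(const std::bitset<256> &cls) {
    NarrowMasks n = buildNarrow(cls);
    Table32 w = buildWide(cls);
    for (unsigned c = 0; c < 256; ++c) {
        uint8_t b = static_cast<uint8_t>(c);
        if (matchNarrow(n, b) != cls[c] || matchWide(w, b) != cls[c])
            return false;
    }
    return true;
}
```

In this model the wide variant needs one table lookup per byte instead of two, which is the kind of simplification a 32-byte table lookup would enable. The real implementation operates on whole vectors and may differ in layout.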

ypicchi-arm commented 4 months ago

Giving it one last check, I just spotted some old code that crept into the commit. Don't merge it yet; I'll fix "dumpTruffleCharReach32" tomorrow.

markos commented 4 months ago

Unfortunately I couldn't even if I wanted to: https://buildbot-ci.vectorcamp.gr/#/changes/226

Many failures, even on Arm. If you want, I can take a look and fix some of those.

ypicchi-arm commented 4 months ago

Ack. Don't worry, I'll look into those and fix them at the same time tomorrow. I tend to only test with gcc locally.

ypicchi-arm commented 4 months ago

The CI fails to pick up my changes correctly after the force push. I'll re-create the PR to get a clean run.