VectorCamp / vectorscan

A portable fork of the high-performance regular expression matching library
https://www.vectorcamp.gr/project/vectorscan/

Speed up truffle with 256b TBL instructions #290

Closed ypicchi-arm closed 4 months ago

ypicchi-arm commented 4 months ago

256b-wide SVE vectors allow some simplification of truffle, giving up to a 40% speedup on Graviton3: the microbenchmark goes from 12500 MB/s to 17000 MB/s. SVE2 also offers this capability for 128b vectors, with a speedup of around 25% compared to normal SVE.

Add unit tests and benchmark for this wide variant
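
For readers unfamiliar with truffle, here is a minimal scalar sketch of the idea behind the wide variant. It is not the code from this PR. It assumes the usual truffle layout of two 16-byte nibble tables (one for bytes with the top bit clear, one for bytes with it set), and models a 32-entry table lookup, which a 256b TBL or the SVE2 two-register TBL can do in one instruction, as a plain list index. The function names and the way the 5-bit index is formed are illustrative assumptions.

```python
def build_masks(char_class):
    """Build the two 16-entry truffle tables for a set of byte values."""
    lo_highclear = [0] * 16  # used for bytes 0x00-0x7F
    lo_highset = [0] * 16    # used for bytes 0x80-0xFF
    for v in char_class:
        bit = 1 << ((v >> 4) & 0x7)  # bits 4-6 select a bit in the entry
        if v & 0x80:
            lo_highset[v & 0xF] |= bit
        else:
            lo_highclear[v & 0xF] |= bit
    return lo_highclear, lo_highset


def match_two_tables(c, lo_highclear, lo_highset):
    """Narrow variant: pick one of two 16-entry tables, then look up."""
    table = lo_highset if c & 0x80 else lo_highclear
    return bool(table[c & 0xF] & (1 << ((c >> 4) & 0x7)))


def match_wide_table(c, wide):
    """Wide variant (assumed layout): one lookup into a 32-entry table.

    The top bit of the byte becomes bit 4 of the index, so no table
    selection step is needed.
    """
    idx = (c & 0xF) | ((c & 0x80) >> 3)  # 0..31
    return bool(wide[idx] & (1 << ((c >> 4) & 0x7)))


# Hypothetical character class used only for illustration.
char_class = set(b"aeiou") | {0x80, 0xFF}
lo_clear, lo_set = build_masks(char_class)
wide = lo_clear + lo_set  # concatenate into a single 32-entry table
```

Both variants accept exactly the same bytes. In this model, the wide form only replaces the two-table selection with a single indexed lookup.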

ypicchi-arm commented 4 months ago

Continuation of #282. I'm hoping to get the CI to actually validate the changes.