intel / hyperscan

High-performance regular expression matching library
https://www.hyperscan.io

Recommendations for running hyperscan on multicore setup #407

Open venkatsvpr opened 1 year ago

venkatsvpr commented 1 year ago

Hello,

This is not directly an issue with the Hyperscan package. I'm posting it here in case folks have some suggestions.

I'm currently experimenting with using Hyperscan for regex matching in a test service written in Go. To interact with the Hyperscan library, I'm using the flier/gohs package.

So far, I've had a great experience using Hyperscan on a single-core machine. However, when I tried running it on a multi-core machine, I didn't observe any significant performance improvements. You can find more details about this issue in this link: https://github.com/flier/gohs/issues/56.

The tests in the link are benchmark tests written in Go. According to the Intel blog post, performance should increase linearly with the number of cores: https://www.intel.com/content/www/us/en/developer/articles/technical/introduction-to-hyperscan.html

I don't understand why the performance doesn't increase linearly in my tests. I'm not sure if I'm missing some configuration or if I'm testing it incorrectly.

Do you have any suggestions?

Thank you!

variar commented 1 year ago

I use Hyperscan from a C++ application. To utilize several cores, I split the input between them and then merge the results. This way I see an almost linear speedup.
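
A minimal sketch of this split-and-merge approach, written in Go because the original question uses gohs. It is not variar's actual code: the standard library's `regexp` package stands in for Hyperscan, and `pattern`, `chunkBounds`, and `scanParallel` are hypothetical names made up for the illustration. It also assumes line-oriented input where no match spans a newline, so chunks can be cut at newlines without losing matches.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
	"runtime"
	"sync"
)

// pattern stands in for a compiled Hyperscan database (assumption): it is
// compiled once and only read afterwards, so goroutines can share it.
var pattern = regexp.MustCompile(`ERROR \d+`)

// chunkBounds returns offsets that cut data into at most `workers` chunks,
// each ending right after a newline so that no line is split in half.
func chunkBounds(data []byte, workers int) []int {
	if workers < 1 {
		workers = 1
	}
	bounds := []int{0}
	step := len(data)/workers + 1
	start := 0
	for {
		end := start + step
		if end >= len(data) {
			break
		}
		nl := bytes.IndexByte(data[end:], '\n')
		if nl < 0 {
			break
		}
		end += nl + 1 // move the cut to just past the newline
		if end >= len(data) {
			break
		}
		bounds = append(bounds, end)
		start = end
	}
	return append(bounds, len(data))
}

// scanParallel scans each chunk in its own goroutine and merges the match
// start offsets, expressed relative to the whole input.
func scanParallel(re *regexp.Regexp, data []byte, workers int) []int {
	bounds := chunkBounds(data, workers)
	results := make([][]int, len(bounds)-1)
	var wg sync.WaitGroup
	for i := 0; i < len(bounds)-1; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			base := bounds[i]
			for _, loc := range re.FindAllIndex(data[base:bounds[i+1]], -1) {
				results[i] = append(results[i], base+loc[0])
			}
		}(i)
	}
	wg.Wait()
	// Chunks are in input order, so concatenating them keeps offsets sorted.
	var merged []int
	for _, r := range results {
		merged = append(merged, r...)
	}
	return merged
}

func main() {
	data := []byte("ok\nERROR 1\nok\nERROR 22\nfine\nERROR 333\n")
	fmt.Println(scanParallel(pattern, data, runtime.NumCPU())) // prints [3 14 28]
}
```

With real Hyperscan the same structure would apply, except that each worker would also need its own scratch space, since a scratch region cannot be used by two scans at the same time.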

hongyang7 commented 11 months ago

@venkatsvpr Hyperscan is a purely single-core matching library and has no multi-threaded design of its own. Multi-core scalability generally refers to the case of multiple inputs and one pattern set: each core runs its own scanning thread on a different input, while all threads share a single read-only copy of the same bytecode.
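
The model described above (many inputs, one shared read-only pattern set, one scanning thread per core) can be sketched in Go as follows. This is an illustration under assumptions, not Hyperscan or gohs code: the standard `regexp` package stands in for the compiled bytecode, and `patterns` and `countMatches` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"runtime"
	"sync"
)

// patterns stands in for the compiled bytecode (assumption): it is built
// once and never modified, so every worker can read it concurrently.
var patterns = regexp.MustCompile(`foo|ba[rz]`)

// countMatches distributes the inputs over a pool of worker goroutines and
// returns the number of matches found in each input.
func countMatches(re *regexp.Regexp, inputs [][]byte, workers int) []int {
	if workers < 1 {
		workers = 1
	}
	counts := make([]int, len(inputs))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker pulls the index of the next unscanned input.
			for i := range jobs {
				counts[i] = len(re.FindAllIndex(inputs[i], -1))
			}
		}()
	}
	for i := range inputs {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return counts
}

func main() {
	inputs := [][]byte{
		[]byte("foo bar"),
		[]byte("nothing here"),
		[]byte("baz baz foo"),
	}
	fmt.Println(countMatches(patterns, inputs, runtime.NumCPU())) // prints [2 0 3]
}
```

Throughput in this arrangement only scales with cores if there are enough independent inputs to keep every worker busy. A benchmark that scans one input at a time on one goroutine will not get faster on a bigger machine.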