flier / rust-hyperscan

Hyperscan bindings for Rust with Multiple Pattern and Streaming Scan

Support for ARM on macOS and AWS Graviton? #20

Open jhecking opened 2 years ago

jhecking commented 2 years ago

I saw that there was an earlier PR, #16, to support ARM using the Kunpeng port. Can anyone confirm whether Rust Hyperscan can be used on ARM, specifically on macOS M1 and/or AWS Graviton processors, using either that port or the Vectorscan port?

bradlarsen commented 1 year ago

Hi @jhecking. I know this issue is pretty old now, but I can confirm that the rust-hyperscan bindings do indeed work on macOS on an M1.

I have been investigating the feasibility of using Vectorscan for Nosey Parker, a scanner for hardcoded passwords that I was just able to release as open-source: https://github.com/praetorian-inc/noseyparker/issues/5.

This process is not well documented, but this is what has worked for me:

  1. Build the Vectorscan library from source on my M1 machine, with a build directory at $BUILD
  2. Set the HYPERSCAN_ROOT environment variable to $BUILD (see https://github.com/flier/rust-hyperscan/blob/b42c421fec7b24ace432c613f72deed76f4a7f2d/hyperscan-sys/build.rs#L11)
  3. Build Nosey Parker, which uses rust-hyperscan as a dependency, with that environment variable set

Doing this, I was able to get Vectorscan, rust-hyperscan, and Nosey Parker working on my M1 Mac, and it all seems to work as expected.
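
For readers who want to see roughly what step 2 amounts to, here is a minimal sketch of how a build script could turn a root directory such as HYPERSCAN_ROOT into Cargo link directives. It is not the actual logic in hyperscan-sys/build.rs. The helper name `link_directives`, the `lib` subdirectory, and the static library name `hs` are assumptions made for illustration.

```rust
use std::path::Path;

/// Hypothetical helper: map an optional root directory (for example the value
/// of HYPERSCAN_ROOT) to the directives a build script would print for Cargo.
/// Assumes the static library lives in `<root>/lib` and is named `hs`.
fn link_directives(root: Option<&str>) -> Vec<String> {
    // Ask Cargo to re-run the build script whenever the variable changes.
    let mut out = vec!["cargo:rerun-if-env-changed=HYPERSCAN_ROOT".to_string()];
    if let Some(root) = root {
        let lib_dir = Path::new(root).join("lib");
        out.push(format!(
            "cargo:rustc-link-search=native={}",
            lib_dir.display()
        ));
        out.push("cargo:rustc-link-lib=static=hs".to_string());
    }
    out
}
```

In a real build.rs, `main` would call something like `link_directives(std::env::var("HYPERSCAN_ROOT").ok().as_deref())` and `println!` each entry. It would also need to link the platform's C++ standard library, which this sketch leaves out.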

This build process is cumbersome, requiring many steps from someone trying to replicate it. I have been experimenting with adding a bundled-vectorscan crate feature to rust-hyperscan so that the crate would build the entire Vectorscan library from source when requested, and use that. This works, but it's not perfect, since a number of dependencies of Hyperscan/Vectorscan still must be present on the system (notably Ragel, cmake, and Boost). But perhaps it would still be useful even without vendoring all those dependencies into rust-hyperscan.
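
As a rough illustration of the dependency problem, a bundled build could at least fail early with a clear message when a required tool is missing. The sketch below only checks for executables on a PATH-style string on a Unix-like system. The function names are hypothetical, and Boost is not covered because it is a library rather than an executable.

```rust
use std::env;
use std::path::PathBuf;

/// Look for an executable named `tool` in each directory of a PATH-style string.
/// Only checks that a regular file exists; it does not check permissions.
fn find_on_path(tool: &str, path_var: &str) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(tool))
        .find(|candidate| candidate.is_file())
}

/// Return the names of the required tools that could not be found.
fn missing_tools<'a>(tools: &[&'a str], path_var: &str) -> Vec<&'a str> {
    tools
        .iter()
        .copied()
        .filter(|tool| find_on_path(tool, path_var).is_none())
        .collect()
}
```

A build script might call `missing_tools(&["ragel", "cmake"], &std::env::var("PATH").unwrap_or_default())` and panic with the list of missing tools before starting the slow source build.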

@flier is this bundled-vectorscan feature something you would be interested in having as a pull request and incorporating in the crate?

jhecking commented 1 year ago

Thanks for the update, @bradlarsen!

flier commented 1 year ago

> @flier is this bundled-vectorscan feature something you would be interested in having as a pull request and incorporating in the crate?

Hi @bradlarsen ,

I think we could add a vectorscan-sys-like sub-crate and let hyperscan depend on it?

bradlarsen commented 1 year ago

> I think we could add a vectorscan-sys-like sub-crate and let hyperscan depend on it?

@flier: That was my initial thought: create a new vectorscan-sys library. But looking at this idea more closely, I realized that the API that the vectorscan library exposes is the same as the API from the hyperscan library. So using vectorscan instead of hyperscan in the hyperscan-sys crate comes down to a difference in linking.

We could create a new vectorscan-sys crate with a bundled option that would build vectorscan from source, but it would be identical to the hyperscan-sys crate except for the build.rs and crate metadata.

Maybe this is still the best way to go? It looks like most of the hyperscan-sys crate is automatically generated.

So what do you think of this proposal for a PR (before I spend time implementing it)?

flier commented 1 year ago

@bradlarsen

If the API is exactly the same, I believe a new vectorscan feature in hyperscan-sys will be good enough. The build.rs could choose to link the hyperscan or vectorscan static library at build time, and pkg-config or HYPERSCAN_ROOT/VECTORSCAN_ROOT could let us use either a system-installed library or a custom build.
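
A minimal sketch of that selection logic, assuming a Cargo feature named `vectorscan` (Cargo exposes enabled features to build scripts as `CARGO_FEATURE_<NAME>` environment variables). The `Backend` type and `select_backend` function are hypothetical names, and the pkg-config fallback is only indicated in a comment.

```rust
/// Which library the build script should link against, and where to find it.
#[derive(Debug, PartialEq)]
struct Backend {
    name: &'static str,
    /// Custom build or install root; `None` means "use the system library",
    /// which a real script might locate through pkg-config.
    root: Option<String>,
}

/// Pick the backend from the environment. `lookup` stands in for
/// `|k| std::env::var(k).ok()` so the logic can be exercised without
/// touching the real process environment.
fn select_backend(lookup: impl Fn(&str) -> Option<String>) -> Backend {
    if lookup("CARGO_FEATURE_VECTORSCAN").is_some() {
        Backend {
            name: "vectorscan",
            root: lookup("VECTORSCAN_ROOT"),
        }
    } else {
        Backend {
            name: "hyperscan",
            root: lookup("HYPERSCAN_ROOT"),
        }
    }
}
```

In a build.rs this would be called as `select_backend(|k| std::env::var(k).ok())`, with the returned root fed into whatever code emits the link directives.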

In any case, if you prefer to introduce a new bundled feature, that's fine with me, but it should be an optional feature, because building from source is really slow. ┓( ´∀` )┏

Thanks

bradlarsen commented 1 year ago

@jhecking fyi, another library provides vectorscan bindings, which builds on non-x86: https://github.com/vlaci/pyperscan

As far as I know, that library is used as an internal dependency of unblob, a Python program that uses Rust in a few places, including to expose Hyperscan or Vectorscan to Python.