Closed tmikus closed 1 year ago
Hiya! That's pretty awesome. Can you share your benchmark results? :-)
Out of curiosity, do you have an ultimate use case for the library? Like, what made you write the binding in the first place?
I do have some feedback.
Firstly, when I run go test
, I get this:
$ go test
go: errors parsing go.mod:
/home/andrew/clones/ahocorasick_rs/go.mod:3: invalid go version '1.21.0': must match format 1.23
Once I switched it to 1.21
, things worked. I was somewhat surprised, because I didn't think the Go toolchain knew how to build Rust libraries. Then I realized that you bundled pre-compiled libraries and hard-code paths to them:
/*
#cgo darwin,arm64 LDFLAGS: -L./lib/darwin -lahocorasick_rs_arm64
#cgo darwin,amd64 LDFLAGS: -L./lib/darwin -lahocorasick_rs_amd64
#cgo linux,arm64 LDFLAGS: -L./lib/linux -lahocorasick_rs_arm64
#cgo linux,amd64 LDFLAGS: -L./lib/linux -lahocorasick_rs_amd64
#cgo windows,amd64 LDFLAGS: -L./lib/windows -lahocorasick_rs_amd64
#include "./lib/ahocorasick_rs.h"
#include <stdlib.h>
*/
import "C"
I'm pretty sure this is no idiomatic. See for example how I structured Go bindings to Rust's regex crate (and the README contains build instructions): https://github.com/BurntSushi/rure-go
Other remarks:
ahocorasickkind
a sub-package? Why not just include the enum in the top-level package?AhoCorasick
searcher after they're done with it? AFAIK, the idiomatic solution here in Go-land is to use runtime.SetFinalizer
for deallocating resources across an FFI boundary.For reference, my benchmark results are:
$ GOMAXPROCS=1 go test -bench .
.
1 total assertion
..
3 total assertions
goos: linux
goarch: amd64
pkg: github.com/tmikus/ahocorasick_rs
cpu: 12th Gen Intel(R) Core(TM) i9-12900K
BenchmarkAhoCorasickGoCloudflare 928 1270858 ns/op
BenchmarkAhoCorasickGoBobuSumisu 618 1956127 ns/op
BenchmarkAhoCorasickRs 1648 727579 ns/op
PASS
ok github.com/tmikus/ahocorasick_rs 3.997s
My suspicion is that there should be cases where the other libraries are faster. IIRC, Go's FFI is quite expensive, so a benchmark on a very short haystack might fair better with a native Go library. They might be able to start and finish a search before your Rust wrapper can even start its search.
With all that said, it looks like you put a lot of work into this library. Very nicely done. Other than the Close
thing, the API looks very comprehensive. :-)
Perhaps it would make sense to hook it up to this benchmark? https://github.com/BobuSumisu/aho-corasick-benchmark
It has the benefit of being a benchmark not written by me or you. :-)
Thank you for the amazing feedback! I couldn't have hoped for a better review of this Work-in-progress library.
Out of curiosity, do you have an ultimate use case for the library? Like, what made you write the binding in the first place?
In my job we might soon be building a Go app that would benefit from having an access to a very fast AhoCorasick library, and after doing some comparison I found that yours was the best one I found. Building a new project with Rust woudn't go down too well with the leadership so I decided to go down the road of the least-resistance, and make the wrapper for Rust version of your library myself 😊
Firstly, when I run
go test
, I get this:$ go test go: errors parsing go.mod: /home/andrew/clones/ahocorasick_rs/go.mod:3: invalid go version '1.21.0': must match format 1.23
I suspect this is because of the minimum version of go in the go.mod
file being set to 1.21
, which actually can be lowered. I'll check what is the actual lowest version of Go this package needs.
I was somewhat surprised, because I didn't think the Go toolchain knew how to build Rust libraries. Then I realized that you bundled pre-compiled libraries and hard-code paths to them:
This is mostly because I don't want people to have to install Rust and manually compile this project just to run the library. I'll check that library you sent to see if there's a nicer way to accomplish this.
- Why do you ask the user to "close" the
AhoCorasick
searcher after they're done with it? AFAIK, the idiomatic solution here in Go-land is to useruntime.SetFinalizer
for deallocating resources across an FFI boundary.
Oh nice, I also don't like that Close()
function... Thank you for pointing me in the direction of SetFinalizer
!
Hiya! That's pretty awesome. Can you share your benchmark results? :-)
Out of curiosity, do you have an ultimate use case for the library? Like, what made you write the binding in the first place?
I do have some feedback.
Firstly, when I run
go test
, I get this:$ go test go: errors parsing go.mod: /home/andrew/clones/ahocorasick_rs/go.mod:3: invalid go version '1.21.0': must match format 1.23
Once I switched it to
1.21
, things worked. I was somewhat surprised, because I didn't think the Go toolchain knew how to build Rust libraries. Then I realized that you bundled pre-compiled libraries and hard-code paths to them:/* #cgo darwin,arm64 LDFLAGS: -L./lib/darwin -lahocorasick_rs_arm64 #cgo darwin,amd64 LDFLAGS: -L./lib/darwin -lahocorasick_rs_amd64 #cgo linux,arm64 LDFLAGS: -L./lib/linux -lahocorasick_rs_arm64 #cgo linux,amd64 LDFLAGS: -L./lib/linux -lahocorasick_rs_amd64 #cgo windows,amd64 LDFLAGS: -L./lib/windows -lahocorasick_rs_amd64 #include "./lib/ahocorasick_rs.h" #include <stdlib.h> */ import "C"
I'm pretty sure this is no idiomatic. See for example how I structured Go bindings to Rust's regex crate (and the README contains build instructions): https://github.com/BurntSushi/rure-go
Other remarks:
* Why is `ahocorasickkind` a sub-package? Why not just include the enum in the top-level package? * Why do you ask the user to "close" the `AhoCorasick` searcher after they're done with it? AFAIK, the idiomatic solution here in Go-land is to use [`runtime.SetFinalizer`](https://pkg.go.dev/runtime#SetFinalizer) for deallocating resources across an FFI boundary. * I'd suggest that your README include a section about why someone might want to use your library instead of a native Go library. "It's faster" is perhaps the easy answer, but it would be good to accompany that with data. There are likely other answers too. This aho-corasick library is, I believe, one of the most featureful in existence. The leftmost-first and leftmost-longest are, in particular, a somewhat unique (but not absolutely unique) offering.
For reference, my benchmark results are:
$ GOMAXPROCS=1 go test -bench . . 1 total assertion .. 3 total assertions goos: linux goarch: amd64 pkg: github.com/tmikus/ahocorasick_rs cpu: 12th Gen Intel(R) Core(TM) i9-12900K BenchmarkAhoCorasickGoCloudflare 928 1270858 ns/op BenchmarkAhoCorasickGoBobuSumisu 618 1956127 ns/op BenchmarkAhoCorasickRs 1648 727579 ns/op PASS ok github.com/tmikus/ahocorasick_rs 3.997s
My suspicion is that there should be cases where the other libraries are faster. IIRC, Go's FFI is quite expensive, so a benchmark on a very short haystack might fair better with a native Go library. They might be able to start and finish a search before your Rust wrapper can even start its search.
With all that said, it looks like you put a lot of work into this library. Very nicely done. Other than the
Close
thing, the API looks very comprehensive. :-)
Thanks, I'll work on adding benchmark section. I'll try to use someone else's benchmarks to avoid the pitfal of over-optimising code for my own benchmarks.
Perhaps it would make sense to hook it up to this benchmark? https://github.com/BobuSumisu/aho-corasick-benchmark
Great plan! I'll try to use that :)
Can you share your benchmark results? :-)
Here are my results on my desktop:
goos: linux
goarch: amd64
pkg: github.com/tmikus/ahocorasick_rs
cpu: AMD Ryzen 7 7800X3D 8-Core Processor
BenchmarkAhoCorasickGoCloudflare-16 914 1303852 ns/op
BenchmarkAhoCorasickGoBobuSumisu-16 585 2061054 ns/op
BenchmarkAhoCorasickRs-16 1446 797147 ns/op
PASS
ok github.com/tmikus/ahocorasick_rs 3.985s
Nice sounds good!
This is mostly because I don't want people to have to install Rust and manually compile this project just to run the library. I'll check that library you sent to see if there's a nicer way to accomplish this.
Yeah I figured as much. My suspicion is that folks won't be expecting a binary blob they didn't build to get linked into their Go build automatically. So it might wind up making folks want to use your library less than if you required a Rust toolchain. I'm not 100% positive, but that's my sense of the zeitgeist. :-)
FYI, turns out even despite the FFI penalty the wrapper of your library is still faster than than any other listed in https://github.com/BobuSumisu/aho-corasick-benchmark 😁
Here are the results from my macbook:
name patterns build search matches alloc
anknown 1000 0.75ms 2.34ms 407 0.06GiB
bobusumisu 1000 0.72ms 0.55ms 407 0.07GiB
cloudflare 1000 6.43ms 0.22ms 9 0.12GiB
iohub 1000 0.37ms 0.41ms 407 0.12GiB
tmikus 1000 1.50ms 0.23ms 388 0.12GiB
anknown 2000 1.49ms 2.26ms 413 0.12GiB
bobusumisu 2000 1.82ms 0.54ms 413 0.13GiB
cloudflare 2000 18.82ms 0.27ms 13 0.23GiB
iohub 2000 0.72ms 0.40ms 413 0.23GiB
tmikus 2000 1.97ms 0.22ms 388 0.23GiB
anknown 4000 3.11ms 2.31ms 1429 0.24GiB
bobusumisu 4000 3.44ms 0.57ms 1429 0.26GiB
cloudflare 4000 29.64ms 0.25ms 45 0.45GiB
iohub 4000 1.39ms 0.48ms 1429 0.45GiB
tmikus 4000 3.43ms 0.24ms 972 0.45GiB
anknown 8000 6.73ms 2.44ms 3485 0.46GiB
bobusumisu 8000 10.16ms 0.67ms 3485 0.50GiB
cloudflare 8000 51.83ms 0.30ms 86 0.89GiB
iohub 8000 2.77ms 0.63ms 3485 0.89GiB
tmikus 8000 6.56ms 0.29ms 2303 0.89GiB
anknown 16000 16.11ms 2.63ms 7977 0.92GiB
bobusumisu 16000 19.34ms 0.86ms 7977 0.99GiB
cloudflare 16000 98.52ms 0.35ms 173 1.77GiB
iohub 16000 6.27ms 1.06ms 7977 1.78GiB
tmikus 16000 12.71ms 0.36ms 5203 1.78GiB
anknown 32000 28.66ms 2.68ms 10025 1.83GiB
bobusumisu 32000 31.09ms 0.93ms 10025 1.97GiB
cloudflare 32000 169.21ms 0.43ms 262 3.50GiB
iohub 32000 12.86ms 1.27ms 10025 3.52GiB
tmikus 32000 25.36ms 0.42ms 6280 3.52GiB
anknown 64000 60.89ms 2.82ms 12505 3.62GiB
bobusumisu 64000 48.42ms 1.12ms 12505 3.88GiB
cloudflare 64000 333.61ms 0.53ms 526 6.85GiB
iohub 64000 25.67ms 1.42ms 12505 6.89GiB
tmikus 64000 49.33ms 0.43ms 7165 6.90GiB
anknown 128000 147.33ms 4.54ms 39334 7.09GiB
bobusumisu 128000 121.79ms 2.46ms 39334 7.62GiB
cloudflare 128000 761.86ms 1.31ms 1141 13.46GiB
iohub 128000 56.81ms 3.51ms 39300 13.54GiB
tmikus 128000 104.42ms 1.22ms 21913 13.55GiB
anknown 256000 341.93ms 6.23ms 59391 13.95GiB
bobusumisu 256000 209.34ms 3.91ms 59391 14.99GiB
cloudflare 256000 1645.95ms 2.03ms 2243 26.84GiB
iohub 256000 122.38ms 5.90ms 58923 26.99GiB
tmikus 256000 235.84ms 1.79ms 29451 27.01GiB
anknown 512000 851.71ms 8.61ms 94000 27.93GiB
bobusumisu 512000 419.63ms 6.07ms 94000 30.01GiB
cloudflare 512000 4240.75ms 3.07ms 4490 53.69GiB
iohub 512000 264.89ms 9.36ms 91986 53.98GiB
tmikus 512000 526.98ms 2.02ms 38269 54.03GiB
Yeah I would expect that I think. That benchmark isn't latency sensitive because the haystack is big enough. You'll likely need to shrink the unit of work (i.e., use a small haystack) in order to observe FFI overhead impacting benchmarks.
You'll likely need to shrink the unit of work (i.e., use a small haystack) in order to observe FFI overhead impacting benchmarks.
Thanks, I'll look into that this week after work 😊
On a similar note, have you seen this blog post about how to make Go FFI calls much faster? https://words.filippo.io/rustgo/
Yes. I read it when it was published. Whether folks will want to go through that and rely on it for faster multi-substring search remains to be seen. :-)
To be clear, I'm not talking about FFI overhead from the lens of "this is something you should fix." I'm talking about it because it's important to present as full a picture as you can when discussing performance.
I just made a slightly modified version of BobuSumisu's benchmark, which tests all libraries against different length input strings: https://github.com/tmikus/aho-corasick-benchmark
The TLDR; is that for inputs below 3k characters it doesn't really matter which Go library one picks. The results are so small that even run-to-run variation can affect the score. For inputs above 3k this wrapper is clearly pulling into the lead.
I feel like chasing the gains provided by the approach from https://words.filippo.io/rustgo/ is not really worth it so I might just leave this project in the state that it is currently, as it's clearly good enough.
Thank you for spending your valuable time on helping me with it!
Perfect! If you want to submit a PR adding a link to your project to the README, that would be great. :-) I would say put it in a new section in the bottom.
Hey Andrew,
I hope you find this useful - I just made a Go wrapper for your amazing library: https://github.com/tmikus/ahocorasick_rs.
It's incredible how much faster your library is compared to the one available in Go, so I thought I might spread your awesomeness to that language. You're welcome to add a link to it in the "FFI bindings" section if you feel like the library is up to the scratch.
In my very simplified benchmark it turned out that the wrapper around your library is much faster than both https://github.com/cloudflare/ahocorasick and github.com/BobuSumisu/aho-corasick