cockroachdb / swiss

Go port of Google's Swiss Table hash table
Apache License 2.0
317 stars 12 forks source link

swiss: optimize bit clear #29

Closed thepudds closed 8 months ago

thepudds commented 8 months ago

When clearing a bit in our bitset in the inner loop of Get, Put, and Delete, we don't need to pass the index of the least significant bit we just found -- we can instead just clear the least significant bit.

Using the idiom of b = b & (b - 1) is simpler and faster than the current bit clearing code. This compiles down to a BLSRQ when building with GOAMD64=v3. For GOARM=7, it looks like it compiles to a LEAQ -1(AX), CX followed by ANDQ CX, AX.

Running the benchmarks seemed to show a geomean improvement of -1.80% sec/op on a GCE n1-standard-16 (with GOAMD64=v3 for both old and new), but it looked a bit noisy so I don't currently entirely trust that number.

cockroach-teamcity commented 8 months ago

This change is Reviewable

thepudds commented 8 months ago

geomean improvement of -1.80% sec/op [...] but it looked a bit noisy so I don't currently entirely trust that number.

Hi @petermattis, I can re-run the benchmarks maybe with a higher -count in the hopes of a less noisy result, but it takes a while to do that, so I was first curious if you are interested in targeted optimizations like this and my earlier #28.

No worries of course if you are not ready for contributions from random people.

To introduce myself briefly -- I was one of the people commenting on golang/go#54766, and I had an earlier SwissTable prototype implementation. These two PRs are things I had noticed in my own prototype. (I had mostly set my version aside when I had concluded I would need to do some more experimentations with different layouts like placing the control bytes with the KVs in each group to perhaps better align with performance expectations people might have built up based on the current runtime map layout... which is part of what was prompting me to ask those questions in https://github.com/golang/go/issues/54766#issuecomment-1951384322 last weekend).

Finally, thank you for taking the time to comment your implementation so clearly, and I'm more than thrilled to see your excellent work here looks like it might be en route to the Go runtime!

petermattis commented 8 months ago

Thanks for the contribution.

thepudds commented 8 months ago

Hi @petermattis, thanks!

One more quick suggestion is to consider using GOAMD64=v3 go test -c or similar for whenever you next update the benchmarks on the README. (In the inner loop, it also gives a more optimized bits.TrailingZeros64 in bitset.first).

petermattis commented 8 months ago

One more quick suggestion is to consider using GOAMD64=v3 go test -c or similar for whenever you next update the benchmarks on the README. (In the inner loop, it also gives a more optimized bits.TrailingZeros64 in bitset.first).

Thanks for the suggestion. TIL about the GOAMD64 env var.