Closed · vmihailenco closed this 2 years ago
Can you demonstrate with a test case or benchmark how this leaks the argument? Note, the Hash and HashString methods are carefully designed to be inlined by the compiler.
I've added a benchmark. PTAL
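A minimal sketch of the kind of allocation benchmark that demonstrates the escape (the names, sizes, and sink variable here are assumptions for illustration, not necessarily what the PR adds):

import "testing"

var benchSink uint64

// BenchmarkHashEscape reports whether the []byte argument is forced onto the
// heap: if it escapes into Hash, allocs/op is non-zero; if not, the slice
// stays on the caller's stack and allocs/op is 0.
func BenchmarkHashEscape(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		buf := make([]byte, 16) // freshly allocated each iteration
		benchSink = Hash(buf)   // assign to a sink so the call isn't optimized away
	}
}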
How about adding it to HashSeed, Hash128 and Hash128Seed?
@zeebo what do you think?
To my understanding, this change is actually unsafe. It intentionally breaks the unsafe.Pointer rules, specifically rule 3, in order to hide a pointer from escape analysis. In https://go.dev/cl/86976/ the resulting pointer is only used for the copy check and is never dereferenced.
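For reference, the helper in question looks roughly like the noescape pattern used in the runtime and in that CL (a sketch, not necessarily the exact code in this PR); it launders the pointer through a uintptr so escape analysis loses track of it:

import "unsafe"

// noescape hides p from escape analysis. The XOR with 0 is a no-op at run
// time, but it stops the compiler from recognizing the uintptr round trip,
// which is exactly the deliberate rule-3 violation described above.
func noescape(p unsafe.Pointer) unsafe.Pointer {
	x := uintptr(p)
	return unsafe.Pointer(x ^ 0)
}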
I'll note that the escape seems to be due to the function indirection and can be avoided by restructuring as below. Though it's unclear to me whether the indirection is itself an optimization.
func Hash(b []byte) uint64 {
	if len(b) <= 16 {
		return hashSmall(*(*str)(ptr(&b)))
	}
	return hashMed(*(*str)(ptr(&b)))
}
It seems this is a no-win situation. With the current implementation the argument escapes; with @vcabbage's restructuring the call is not inlined, which penalizes byte slices.
What about going in another direction and merging hashSmall and hashMed? We already have a length check, and it has to be performed anyway, ending up with...
func Hash(b []byte) uint64 {
	return hashMed(*(*str)(ptr(&b)))
}

// HashString returns the hash of the string.
func HashString(s string) uint64 {
	return hashMed(*(*str)(ptr(&s)))
}

func hashMed(s str) (acc u64) {
	p, l := s.p, s.l
	switch {
	case l <= 16:
		switch {
		case l > 8:
			...
This works fine to avoid the escape:
BenchmarkHashEscape-32 376313569 3.205 ns/op 0 B/op 0 allocs/op
So far so good. However, it appears to cause a slight regression for small payloads on my machine:
Before:
BenchmarkFixed64/8/default-32 423928652 2.781 ns/op 2876.96 MB/s
After:
BenchmarkFixed64/8/default-32 389967436 3.086 ns/op 2592.35 MB/s
Inspecting the asm doesn't reveal anything, to me at least. I tried aligning the jump target for the < 16 path, and I still can't really explain it. It could also just be my specific processor.
I'm playing around with removing the call through the function pointer as well. One thing I'm going to experiment with: in go1.18, switches can potentially be compiled differently, as jump tables. Maybe some of the losses can be recouped there.
It's intentionally breaking the unsafe.Pointer rules
My understanding is that this change can't interfere with the unsafe rules, because it does not change the generated code in this package. It only hacks escape analysis on the calling side of the Hash func so that the passed arg is not allocated on the heap. If the arg is not used/referenced by xxh3 after the Hash function exits, this change should be safe.
I'm not an authority on this, so take my thoughts with the appropriate grain of salt 🙂
My understanding is that this change can't interfere with unsafe rules, because it does not change the generated code in this package.
The noescape function doesn't generate any code directly, but changing escape analysis can change the generated code. Heap allocating or not is part of the code.
It only hacks escape analysis on the calling side of Hash func to not allocate the passed arg on the heap. If the arg is not used/referenced by xxh3 after Hash function exits, this change should be safe.
If there were no more references to b in the calling code, could it end up being collected while the hash function is executing? I don't know with any certainty, but with the unsafe rules broken it wouldn't surprise me.
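To make that scenario concrete, here is a hypothetical shape of the concern (whether the collector can actually reclaim the array mid-call depends on liveness details I'm not certain about):

// digest sketches the situation in question: b has no uses after the call,
// so once its escape into Hash is hidden, nothing in this frame obviously
// keeps the backing array live while Hash is still reading it.
func digest(n int) uint64 {
	b := make([]byte, n)
	for i := range b {
		b[i] = byte(i) // last direct use of b before the call
	}
	return Hash(b)
}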
Apparently the usage in strings.Builder has caused an issue. I imagine the experienced folks that used it in strings.Builder weren't expecting that, which highlights that the USE CAREFULLY! comment shouldn't be overlooked.
At the very least, if it's possible to avoid the allocation without using such a tricky function that seems preferable.
So I experimented quite a bit with shuffling the functions around to avoid the call through the pointer, and here's a table summarizing the results:
There's a lot to like about that: sizes above 16 are pretty much entirely wins, sometimes on the order of 15%. On the other hand, sizes smaller than 16 are sometimes worse. But while the percentages are large, in absolute terms they account for less than 1 ns. For example, the size-3 case that regressed by about 20% went from 2.8 ns/op to 3.35 ns/op.
So I think this is acceptable, and I'm gonna go with it rather than the "noescape" trick. As noted above, the output code being the same doesn't imply that the runtime metadata the garbage collector uses is the same, and the fact that the trick has been implicated in invalid GC state makes me even more wary.
Thanks for noticing this and providing the test case to demonstrate it.
Thanks for fixing this.
Hard to say if this would help with performance, but I wonder why you convert everything to str rather than using []byte as is? That would remove unsafe entirely for the common case of hashing bytes. For strings, you could use unsafe to cast the string to a []byte.
The code used to pass two parameters: the data pointer and the length. It was changed to pass the str struct because that made the inliner happy. Now that the bodies of the functions are simpler again, it could go back to doing more "normal" unsafe stuff, but there is no difference.
It would be incorrect to use unsafe to interpret a string as a []byte. A []byte is larger than a string: it has a data pointer, a length, and a capacity, whereas the string only has the data pointer and length. str has the same memory layout as a string, and the same memory layout as the first two fields of a []byte, so unsafe-casting both to the same type allows sharing the same code to process both.
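A sketch of the layout relationship being described (the field names follow the p/l usage earlier in the thread and assume unsafe is imported; the real definitions in the package may differ):

// str mirrors a string header: a data pointer and a length. A slice header is
// the same two fields plus a capacity, so reading only the first two fields of
// a []byte as a str lets strings and byte slices share one code path.
type str struct {
	p unsafe.Pointer
	l uint
}

func sliceToStr(b []byte) str  { return *(*str)(unsafe.Pointer(&b)) }
func stringToStr(s string) str { return *(*str)(unsafe.Pointer(&s)) }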
Now that the bodies of the functions are simpler again
I meant to use []byte and restore the optimization for small sizes: use []byte, no unsafe, and no indirection.
It would be incorrect to use unsafe to interpret a string as a []byte
Every second library has a function to cast a string to a []byte using unsafe. Perhaps you mean slower instead of incorrect.
// Bytes converts string to byte slice.
func Bytes(s string) []byte {
	return *(*[]byte)(unsafe.Pointer(
		&struct {
			string
			Cap int
		}{s, len(s)},
	))
}
I don't understand. The main performance loss is because the check for smaller/larger than 16 bytes is no longer inlined. It was inlined because of the way it called through a function pointer. None of that has to do with unsafe, and I don't see how keeping the value as a []byte would help.
I meant that doing *(*[]byte)(unsafe.Pointer(&someString)) is incorrect. I see no benefit to using a []byte as opposed to a string in this case, because the function does not mutate it and does not need to pass it to something that takes a []byte.
Go code that uses unsafe.Pointer requires this hack, or the []byte arg always escapes to the heap. cespare/xxhash uses asm with go:noescape instead. See https://github.com/golang/go/issues/23382 for an example from the stdlib.
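For comparison, the asm route mentioned above looks roughly like this: the Go declaration has no body, the implementation lives in a .s file, and the go:noescape directive tells the compiler the slice's pointer does not escape (the name and signature are illustrative, not cespare/xxhash's actual declaration):

// sum64 is implemented in assembly in a .s file in the same package.
// //go:noescape is only permitted on declarations without a Go body.
//
//go:noescape
func sum64(b []byte) uint64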