Open · GiedriusS opened 17 hours ago
cc @mknyszek
There's no internal limit; this is just a bug. ("Ran out of hash bits" means that the implementation would have expected a hash collision, given that it just walked over (what it thinks is) the maximum depth of the tree and ran out of hash bits. Hash collisions are still handled, though.)
Looking into it.
I'll try to add an atomic counter to see how many calls it takes to trigger this panic.
My suspicion is an off-by-one error in creating new nodes. Like, the hash matches down to the last 4 bits, but the last 4 don't and we give up too early. It occurs to me that this codepath isn't well-tested because actual collisions (which are well-tested) short-circuit to appending onto the overflow list for a node.
I'll try to write a test for this case and see if that's the problem.
Yeah, actually, I'm feeling more confident in this diagnosis. If it's the case, I should have a fix and a test shortly.
EDIT: No, that doesn't seem to be it.
I took a brief look at https://github.com/planetscale/vtprotobuf/pull/140 and I noticed that you're using `unsafe.String` to pass a byte slice through to `unique.Make`. This is fine, but one thing that occurs to me is that on insert, the node we're colliding with is re-hashed. If the byte slice you passed to `unique.Make` is mutated prior to this commit, then it's possible that that produces a different hash.

It looks like you already tried at tip, though, past that commit, and ran into the same issue. Can you confirm that?
Yeah, that's why I tried the tip version. Without that commit the panic occurs within a few seconds of me triggering that code path. With that commit it's a bit harder and occurs in 30-60 seconds.
If you have any instructions to reproduce, or if you're willing to share a core file or something (so that I can inspect the state of the HashTrieMap on crash) that would help me iterate on this faster.
I can also write a small patch to dump the state of the HashTrieMap on crash, if you're willing to run that and provide the output. (EDIT: If it helps, I can remove the actual key/value output; I think I just need the hash anyway.)
Change https://go.dev/cl/614475 mentions this issue: unique,internal/concurrent: add some more tests
Go version
go version devel go1.24-3d33437c45 Sun Sep 15 02:05:37 2024 +0000 linux/amd64
Output of `go env` in your module/workspace:

What did you do?
I'm trying to intern all strings coming through gRPC using the new unique.Make(), in a new vtprotobuf extension. It seems like it is very easy to hit some internal limit? The machine has 128 cores and 256 GiB+ of RAM, so resources shouldn't be a problem.
I tried creating a toy program that interns random strings and couldn't reproduce it, unfortunately.
What did you see happen?
I see this after some time:
Same happens with 1.23.1.
What did you expect to see?
I expected no panic.