go4org / intern

BSD 3-Clause "New" or "Revised" License
117 stars 4 forks source link

Potentially unsafe assumptions around finalizers and optimizing compilers #13

Open mdlayher opened 3 years ago

mdlayher commented 3 years ago

Reported by @bcmills on Gophers Slack #darkarts:

@mdlayher, that interning trick is cute, but wrong in a couple of ways I think. For a start, the uintptr conversion should happen after the call to runtime.SetFinalizer, not before. Even with the current memory model, the Value could be allocated on the stack before the finalizer is set, then moved to the heap when the call to SetFinalizer happens.

At any rate: @mdlayher, the deeper way that it's wrong is that an optimizing compiler — even one with a nonmoving garbage collector! — can theoretically notice that the resurrected field is only ever accessed on invalidly-converted uintptr values, which (from the unsafe.Pointer rules) it can assume is not allocated in the Go heap, and from there it could notice that the if v.resurrected condition in the finalizer is statically false (because the finalize function is only ever invoked as a finalizer on Go-allocated memory) and eliminate it as dead code. (edited) I do not expect the production Go compiler to ever actually implement that optimization, but it is certainly not forbidden. Which is to say: your unsafe shenanigans depend on a lot more assumptions than just “non-moving GC”.

@josharian also spoke up a bit later (and he knows a lot more about this repo and its tricks than I do):

Yes, weak refs are exactly what are missing. Ian claimed (in the comment linked to in the internal comments in package intern) that weak refs and finalizer are equivalent in strength, and gave an example of how to convert a finalizer into a weak ref. But his translation doesn’t work for interning. Maybe there’s a different way that’s safe—that’d be great! But it looks to me at this moment like they’re not equivalent, which maybe re-opens the question of whether Go should have weak refs.

And more context from Brian:

Finalizers and weak references are not equivalent due to the built-in hashing and equal operations. https://golang.org/issue/32779#issuecomment-526179756 (edited)

Honestly, though, most of the time I'm interning strings I don't care about ever garbage-collecting them until some epoch (like the end of the lifetime of some other associated data structure), at which point I can throw away all of the interned strings. But for that use-case, sync.Map should already function as a perfectly serviceable interning table. (LoadOrStore is exactly the interning operation.)

So you could imagine, say, something analogous to sync.Map but with the values lazily (rather than eagerly) copied from the read-only side on the first use after a new map has been promoted, perhaps with some atomic value indicating whether the copy has already occurred (to avoid cache-thrashing on subsequent accesses). (edited) But that's probably more #performance than #darkarts, since there's no unsafe needed. (But maybe folks consider atomic to be #darkarts too?)

I don't know if much of this is immediately actionable (aside from the first point about moving the uintptr conversion) or just something to be aware of. And perhaps this gives more motivaiton to push for weak references as Josh mentioned.

josharian commented 3 years ago

Bryan's point is good, but theoretical. For practical purposes, this is the key sentence:

I do not expect the production Go compiler to ever actually implement that optimization

Perhaps the magic "no moving GC" package should disappear in favor of a more mundane "verified in 1.16" package. Hopefully weak refs will be added because some compiler breaks package intern as it stands.

mdlayher commented 3 years ago

See also https://github.com/golang/go/issues/43615.