Open bcmills opened 1 year ago
Overall, this looks good to me. A few minor questions:
The address must remain valid for as long as any Go pointer (of type
Pointer
or any other pointer type) refers to it.
Do we need this requirement? I don't think the GC will follow non-Go pointers, so I think this should be fine as long the application doesn't dereference the pointer when it is invalid. (I am imagining a library that temporarily unmaps (or more likely, protects PROT_NONE) pages during transient periods when they should not be accessed).
The compiler's -d=checkptr mode would check conversions from uintptr to unsafe.Pointer. The conversion may throw at run-time if: the uintptr refers to an invalid (unmapped) nonzero address.
How do we implement this? Add an unconditional dereference of the pointer? Does a PROT_NONE-protected page count as "valid"? If so, a dereference isn't sufficient, you'd need to either try to map over the page, or parse /proc/self/maps.
Do we need this requirement? I don't think the GC will follow non-Go pointers
I think the bad pointer in Go heap
check will trigger if an unsafe.Pointer
variable whose storage is itself in Go memory contains a value that is too bogus (like maybe too close to 0, or formerly-but-no-longer part of the Go heap)?
Plus, if the non-Go allocator does actually unmap the address (instead of mapping it PROT_NONE
), then it's possible the Go allocator will later make use of that address while the pointer still exists.
I would be ok with weakening the requirement to “must remain mapped”, but I'm not sure that I want to try to define “mapped” portably. 😅
How do we implement this?
Heavy emphasis on the “may”?
I was thinking we would call runtime.findObject
and let it do its thing: if it calls badPointer
so be it, and if it reports a nonzero base
we fail with a checkptr
error.
At first glance updating vet for this means simply permitting all conversions from uintptr
to unsafe.Pointer
, because in general vet has no idea whether the value is valid. Can we do better?
"This address must remain valid" seems both too strong and not strong enough. What actually matters is that the address is never allocated by the Go runtime. I'm not sure how to phrase that, though.
At first glance updating vet for this means simply permitting all conversions from
uintptr
tounsafe.Pointer
I agree that's the simplest change to make to it. Part of the intent is to shift the burden of enforcement from a relatively noisy static check (unsafeptr
) to a more precise dynamic one (checkptr
, which is enabled by default under the race detector).
It is possible in theory for vet
to continue to notice and flag cases where the uintptr
is always derived from a pointer to Go memory, but I'm not sure it would be worth the effort to implement given that checkptr
could detect those cases with very high precision.
Also, the large number of vet
warnings in x/sys
suggests that the existing vet
check is probably not widely used today. I'm not sure that we would lose much in gutting it.
What actually matters is that the address is never allocated by the Go runtime. I'm not sure how to phrase that, though.
Maybe:
“A uintptr
containing a memory address allocated outside of Go
may be converted to Pointer
. The address must remain unavailable
to the Go runtime (for example, due to an allocation using cgo or
syscall.Mmap
) for as long as any Go pointer (of type Pointer
or any other pointer type) refers to it.”
The address must remain unavailable to the Go runtime (for example, due to an allocation using cgo or syscall.Mmap) for as long as any Go pointer (of type Pointer or any other pointer type) refers to it.
Tying these two things together worries me. Having a program be incorrect if it leaves behind a pointer it never dereferences to some memory region that was unmapped and remapped by the Go runtime feels too subtle to me. I'm also not sure how to implement a dynamic check for this case since there's no way the runtime could identify that a pointer just sitting there previously referred to non-Go memory.
This is something of an insight from the arena proposal: it's really not safe to just unmap memory regions that Go pointers reference. The arena implementation waits until the GC can't find a Go pointer to the region before allowing that address space to be reused.
One way we can make this OK is to support GC-managed memory regions, where the GC will let you know (via a finalizer or similar mechanism) that it's really OK to unmap a region. (Side note: this is actually fairly easy to implement today, if we require/guarantee that memory regions are aligned to 8 KiB, and are at least 8 KiB in size. Unfortunately our most popular platforms all have 4 KiB physical pages...)
Having a program be incorrect if it leaves behind a pointer it never dereferences to some memory region that was unmapped and remapped by the Go runtime feels too subtle to me.
That's why it's unsafe
! 😅
I'm also not sure how to implement a dynamic check for this case since there's no way the runtime could identify that a pointer just sitting there previously referred to non-Go memory.
If the GC rate is high enough, eventually the pointer will be unallocated when the GC scans it and trip the badPointer
check, no?
This is something of an insight from the arena proposal: it's really not safe to just unmap memory regions that Go pointers reference.
Agreed, but I expect that in the vast majority of cases, the addresses converted from uintptr
to unsafe.Pointer
will never be unmapped and it will be a non-issue.
On the other hand, to the extent that it is a problem under this proposal, it is already a problem under cgo
in general today: suppose that a C function allocates a memory region using mmap
, passes that pointer into a C-exported Go function (perhaps as type void*
a.k.a. unsafe.Pointer
), and unmaps it after the Go function returns. That's exactly the same problem without involving uintptr
conversions, no? If the runtime happens to acquire that address in a subsequent allocation, any lingering pointers may trigger the badPointer
check during the next GC cycle.
(CC @dominikh @timothy-king)
That's why it's
unsafe
! 😅
Haha, yeah I suppose so. :)
If the GC rate is high enough, eventually the pointer will be unallocated when the GC scans it and trip the
badPointer
check, no?
That's probably true. There are a few different failure modes and outcomes possible in today's GC (unclear what it might look like under different GC implementations). There's just varying levels of unsafety here and since this is one of those things that relies on pointers existing, so it's especially hard to track down the issue from just the error you get. Maybe that's just what we're willing to say is what happens with cgo
and users should defensively nil out references to any non-Go mappings before unmapping them.
Agreed, but I expect that in the vast majority of cases, the addresses converted from
uintptr
tounsafe.Pointer
will never be unmapped and it will be a non-issue.
Also probably true. I guess at the very least I'd argue for some strongly worded advice pointing out that this is really the only case that's straightforward to reason about, until we have some way for the GC to notify code that some memory region is no longer referenced.
On the other hand, to the extent that it is a problem under this proposal, it is already a problem under
cgo
in general today: ...
That's a good point. Though, I'd argue we should just offer a solution to that problem (independently of this proposal).
Lastly, I'm paranoid that we're codifying a constraint on GC implementations with this. It's ensuring that the GC always has to be somewhat tolerant of what actually is in a pointer slot. The runtime definitely relies on this property internally in the current implementation, but how does it restrict future implementations? And maybe I'm wrong and we've already codified this property, making it a moot point? I don't know yet either way.
Maybe that's just what we're willing to say is what happens with
cgo
and users should defensively nil out references to any non-Go mappings before unmapping them.
FWIW, as far as I can tell this is exactly analogous to the cgo pointer-passing rules: “C code may not keep a copy of a Go pointer after the call returns.” It just happens to run in the other direction too: if a C-like program passes a pointer to a Go function, the Go code may not keep a copy of the C pointer either. Fair's fair, after all! 🙃
I'm paranoid that we're codifying a constraint on GC implementations with this. It's ensuring that the GC always has to be somewhat tolerant of what actually is in a pointer slot. … And maybe I'm wrong and we've already codified this property, making it a moot point?
Between cgo
and syscall.Mmap
, I think we already have to have that tolerance, and I think it's too ingrained to ever remove at this point.
It may be that we could get by with requiring that the uintptr
is actually a mapped address (not just “outside the Go heap and sufficiently far from zero”), but as @prattmic noted that may be harder to enforce precisely.
If we say "uintptr to unsafe.Pointer is OK for non-Go pointers" then I think there's almost no way that vet could tell with any precision when to report an error. Any pointer being converted to uintptr and back might have come from C. The only clear case to flag would be:
x := uintptr(unsafe.Pointer(&v)) x += 2 p := unsafe.Pointer(x)
but that doesn't seem very common.
Is there something we can do to flag bad code in the vet check?
Perhaps we should run the unsafeptr vet check across the whole open source ecosystem?
If we say "uintptr to unsafe.Pointer is OK for non-Go pointers" then I think there's almost no way that vet could tell with any precision when to report an error.
Agreed — for vet
to report an error it would have to be able to prove that a pointer may alias a Go allocation.
Is there something we can do to flag bad code in the vet check?
I think we should more-or-less give up on the vet
check. It has so many false positives today that I doubt anyone uses it in codebases of significant size.
More importantly: it should be straightforward to implement a much more precise checkptr
check for these conversions, and checkptr
is already enabled by default in -race
mode. So we would lose (static, imprecise) coverage from vet
, but gain a comparable amount of (dynamic, precise) coverage from checkptr
.
This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group
Alternatives
The
vet
warning can be worked around today by relying on a liberal reading of the “equivalent memory layout” rule, rewriting
Fwiw, I worked around this in the past with a module that just has
func IntoUnsafePointer(v uintptr) unsafe.Pointer {
return unsafe.Pointer(v)
}
Then in my main module I have vethack.IntoUnsafePointer(v)
.
Perhaps the go stdlib could provide something like unsafe.UnsafePointerFromNonGoPointer
or a better named function? I could also imagine a runtime check at some point to assert that this pointer is not managed by the go gc
Another way to solve the issue would be some kind of address space or memory tagging that makes sure that Go can still be a memory safe language and more of such hacks to make it even less memory safe can be avoided.
That would allow using normal structs again and track the data path in static verification tools and maybe dynamic runtime parts to support this.
I am aware that this might lead to a counter proposal, but want to know if that line of thought has been considered as I was unable to find that in this discussion.
Another way to solve the issue would be some kind of address space or memory tagging that makes sure that Go can still be a memory safe language and more of such hacks to make it even less memory safe can be avoided.
@nightlyone, sorry, but I don't see how that's relevant to this proposal.
Existing system calls and existing C functions may return pointers to parts of the address space outside of what is managed by the Go runtime. No change to the Go runtime or toolchain can change that fact.
The checkptr
dynamic check may already look up addresses managed by the Go runtime, check the runtime's metadata about them, and flag misuses of unsafe.Pointer
based on that metadata. Part of this proposal is to do so more aggressively, rather than relying on a noisier vet
check to flag those misuses.
The conversion may throw at run-time if: [...] the uintptr refers to an invalid (unmapped) nonzero address.
I'm not convinced this should fail. Whether an address is valid is a dynamic property that can change over the lifetime of the address space, whereas the same Go pointer value could remain live across those changes. It seems like a semantic mismatch to force validation at the initial conversion to unsafe.Pointer, if we're not going to re-validate them on address space changes.
I was thinking we would call runtime.findObject and let it do its thing: if it calls badPointer so be it, and if it reports a nonzero base we fail with a checkptr error.
FWIW, I believe this is what checkptr does already, unless I misunderstand what you mean: https://github.com/golang/go/blob/master/src/runtime/checkptr.go#L50
I believe this is what checkptr does already
Neat! Then that part is already taken care of. 😃
I guess under this proposal we would have to relax the check for pointers near 0, though: https://github.com/golang/go/blob/c63066123bb5c2ef7a6d26d1a6e3e5f1012a1e23/src/runtime/checkptr.go#L51-L53
(Instead, perhaps we could check that if each of the originals
has a nonzero base, then the result must also have a nonzero base?)
Whether an address is valid is a dynamic property that can change over the lifetime of the address space, whereas the same Go pointer value could remain live across those changes.
Agreed. I'm leaning toward the variation in https://github.com/golang/go/issues/58625#issuecomment-1440188404; I'll update the top comment to use that.
Rewording again:
“A uintptr containing a memory address allocated outside of Go may be converted to Pointer. Any such Pointers must no longer be reachable from live Go values before that memory can be returned to its allocator."
It is unclear how the runtime can actually guarantee this - there are difficult implementation questions to eliminate all possible races. But it seems like the above is the model we'd want to give users, if we were going to guarantee something.
Given the complexities here, maybe we don't want to guarantee anything? What are the valid use cases for this other than mmap? We could say code should use syscall.Mmap, which does the conversion itself in code we control.
What are the valid use cases for this other than mmap?
Mostly cgo: if you call a C function that returns a uintptr
and you know that it is the address of something in the C heap, you should be able to convert that uintptr
to the corresponding Go pointer type.
It also appears to be needed for a couple of other system calls that allocate memory in the caller's address space when they are invoked through syscall.Syscall
or equivalent, such as shmat(2)
on POSIX platforms, or LocalAlloc
or LockResource
on Windows.
There are also a number of Windows C libraries that are loaded and called dynamically, such as GetSidIdentifierAuthority
in x/sys/windows
.
There are also some uses in x/sys
that convert uintptr
arguments passed to Windows callbacks, such as in ctlHandler
. It's not obvious to me whether those could be replaced with typed pointer arguments.
Finally, these conversions could be used for relocations on platforms with predictable memory layouts (such as the github.com/ebitengine/purego
usage reported in #56487), and for machines with memory-mapped I/O devices (perhaps certain embedded devices, particularly for ARM and MIPS?).
It is unclear how the runtime can actually guarantee this - there are difficult implementation questions to eliminate all possible races.
I think that was why @ianlancetaylor suggested that we not require the address to remain valid (see https://github.com/golang/go/issues/58625#issuecomment-1439216972).
I agree that there are some details to work out to avoid spurious bad pointer in Go heap
failures during GC, but as far as I can tell, to the extent that that is a problem it is already a problem today, given how many of the system call wrappers in x/sys/unix
and x/sys/windows
already do this sort of conversion. 😅
Change https://go.dev/cl/492415 mentions this issue: windows: use unsafe.Add instead of pointer arithmetic on a uintptr
CL 465235 shows the existing call sites in x/sys
that are currently flagged by go vet
and would become valid under this proposal. Most are Windows system calls.
There are at least three mostly independent issues here:
Do we want to define rules so that packages like syscall are valid under the rules, or do we just say they're fine because they are shipped with the Go toolchain?
Can we make the unsafeptr vet check precise enough to enable by default? (If (1) is no, we could leave it disabled just for the standard library.)
Are there lurking bugs in the runtime around use of mmap/munmap and races with the Go allocator? (Maybe but not proposal material, just bug fixes.)
For this proposal, (1) and (2) are what matter.
For (1), it's unclear what external code needs the same kind of guarantees and low-level work, especially since we have syscall.Mmap already (which returns a slice, not a uintptr, so callers don't need a uintptr->pointer conversion). What code is legitimately converting uintptrs to unsafe.Pointers today? We don't really know. Possibly many of the Windows DLL calls?
For (2), to date we don't think the unsafeptr check is precise enough to enable during 'go test' for everyone, even outside std. And a change to the rules would make it basically never possible to trigger. So maybe the answer to (2) is definitely not, and we should delete the unsafeptr check.
Do we want to define rules so that packages like syscall are valid under the rules, or do we just say they're fine because they are shipped with the Go toolchain?
The version of x/sys/windows
in use in a user program is not necessarily the one shipped with the toolchain. I agree that in general it's ok for, say, runtime
to violate the usual rules, but I'm not sure that I could coherently extend that to x/sys
.
I think it should be possible to use x/sys
with other Go implementations (like gccgo
and TinyGo) without making assumptions beyond what we document for external Go users.
What code is legitimately converting uintptrs to unsafe.Pointers today? We don't really know. Possibly many of the Windows DLL calls?
The examples in x/sys
in https://go.dev/cl/465235 skew heavily Windows, and I would expect that to be true in the outside world as well (CC @golang/windows).
That said, when I searched GitHub for uses of mkwinsyscall
and spot-checked the results, I wasn't able to find any wrappers outside of std
and x/sys
that perform uintptr
-to-unsafe.Pointer
conversions. But there could also be some survivorship bias there. 🤔
especially since we have syscall.Mmap already (which returns a slice, not a uintptr, so callers don't need a uintptr->pointer conversion).
I actually wrote my own mmap(2)
wrappers for commonly used platforms to avoid the overhead of mmapper
in syscall
and x/sys/unix
.
It's not clear to me why we have mmapper
. Users of syscall.Mmap
and unix.Mmap
are already writing unsafe platform-specific code. This little bit of "extra protection" is unlikely to be useful. In fact, I'd go as far as saying the kind of misuse it allows is just buggy code.
I'd happily switch away from my own mmap(2)
wrappers if mmapper
is removed from syscall
and x/sys/unix
, or a new exported function for mmap(2)
without mmapper
is added.
@database64128 what is the amount of overhead you are seeing from the mmapper? Are you concerned about the lock or the map operations? When I wrote that I expected Mmap calls would be infrequent, certainly not often enough to make a trivial lock+map+unlock a bottleneck.
Also, the point of syscall.Mmap is that it avoids unsafe entirely. You don't need to import unsafe at all to use it, and if you don't import unsafe, there is no possibility of misuse.
I also can't use syscall.Mmap
because mmapper
is incompatible with mremap
, which I use to resize an existing mmap
.
@aaronbee thanks, added #60409
what is the amount of overhead you are seeing from the mmapper? Are you concerned about the lock or the map operations? When I wrote that I expected Mmap calls would be infrequent, certainly not often enough to make a trivial lock+map+unlock a bottleneck.
Thank you for asking these follow-up questions.
Yes, you're right that mmap
calls are usually infrequent and mmapper
is unlikely to become a bottleneck. So I completely understand if you decide to keep mmapper
.
I initially wrote my own mmap
wrappers because:
mmapper
code and really disliked the approach. Just because we could afford a few extra sequentially consistent atomic operations, doesn't mean we should.mmapper
can be trivially bypassed by reslicing like a[low : high : max]
, which I do use in part of my code.mmapper
's behavior in my own code.mmap
, and the best way seemed to be implementing my own syscall wrappers.munmap
actually allows you to unmap only part of a mapped address space. Then theoretically you could fill that space again by mmap
ing other stuff into it. This is not supported by mmapper
.It seems like we should try to make the unsafeptr check accurate, perhaps by ignoring uintptr conversions made on results from syscall.Syscall and maybe also x/sys/windows.Proc.Call (that wraps syscall.Syscall so maybe it's not a special case if we flow information up like we do in the printf checker). If we can make unsafeptr accurate, then that can be the rules we write down too. And I think we'd just say runtime is very special and never going to meet the unsafeptr rules and just have unsafeptr disable itself on runtime.
Does anyone want to try to improve the unsafeptr precision?
Here are the results of running the unsafeptr check on the latest version of all public modules stored on proxy.golang.org that are required by >= 1 other modules (this filters out many forks). https://gist.github.com/rsc/d56cf40e010112f67339340101f92f17
A few observations:
//goland:noinspection GoVetUnsafePointer
annotation that I assume silences the unsafeptr check in GoLand.Seeing these results, it seems to me that fixing the unsafeptr check to be precise is a hopeless cause. We simply cannot decide statically.
However, among the many false positives, there are real problems unsafeptr turns up, like converting an integer off the network to a channel or even subtle misuses of reflect that store uintptr-typed pointers onto the stack where the GC will not see them. We want to preserve some way to find and report these. We have a way to do this today, namely -d=checkptr=1, which checks unsafe.Pointer(u) conversions and is enabled automatically in -race and -msan builds. Because it is a dynamic check, it can permit code to synthesize non-Go pointers but then block the exact same code and call stack from synthesizing a Go pointer.
If we can make that unsafe.Pointer(u) check very cheap, say under 10X the cost of an array bounds check, then perhaps it would make sense to enable it always, not just with -d=checkptr=1. Based on discussion with @mknyszek, @mdempsky, @aclements, and others, it seems like we probably can make the check cheap enough to run in ordinary production use. I think we should explore that. Specifically, I suggest:
For this proposal issue, I think all we need to agree on is (1) and (2) in my previous comment, namely the additions to the spec's rules for unsafe.
Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group
In particular x := (*[1<<63-1]byte)(unsafe.Pointer(1)) is OK and i := uintptr(unsafe.Pointer(&z)) is OK, but x[i-1] is not.
I'm not sure I understand this part, or at least how we're going to implement it. x[i-1]
(syntactically) doesn't include an unsafe conversion. How are we going to disallow it (both in the spec and at run time)?
Or should we disallow conversion like x := (*[1<<63-1]byte)(unsafe.Pointer(1))
if it overlaps with Go memory?
Or should we disallow conversion like
x := (*[1<<63-1]byte)(unsafe.Pointer(1))
if it overlaps with Go memory?
I think we should do that. (I thought -d=checkptr
already flagged that today, in the converted pointer straddles multiple allocations
check?)
In general we can't implement it. We can implement the specific case of x := (*[1<<63-1]byte)(unsafe.Pointer(1))
but I don't think we're going to instrument unsafe.Add(p, N). So if you did x := (*byte)(unsafe.Pointer(1)) and unsafe.Add(x, Ptr-1) I don't think we're going to catch that.
I agree that we shouldn't instrument unsafe.Add
by default, but maybe it would be reasonable to do in -d=checkptr
mode?
- In the spec's unsafe.Pointer rules, add a new rule allowing any of the existing uintptr->unsafe.Pointer cases to include the addition of an offset within the same allocation, addressing a common false positive case above.
If I understand correctly, this means that if unsafe.Add(unsafe.Pointer(x), 1)
is valid today, then users would be able to write it as unsafe.Pointer(x + 1)
instead? That seems reasonable to me.
Caveat: This isn't actually always safe today with cmd/compile. For example, compiling https://go.dev/play/p/SWa3KGxqKPk with -live
shows a live temporary at line 18, but not 11. I think that should be easy to fix though.
[2.] In particular x := (*[1<<63-1]byte)(unsafe.Pointer(1)) is OK and i := uintptr(unsafe.Pointer(&z)) is OK, but x[i-1] is not.
Can you elaborate what you mean that x[i-1]
is not okay? Specifically, what obligations are on the implementation to provide when compiling that expression? E.g., is the compiler required to conservatively assume x[i-1]
and &z
may alias?
Note: An expression like uintptr(unsafe.Pointer(&z))
doesn't force z
to be heap allocated. If it's stack allocated and the stack moves between assigning i
and evaluating x[i-1]
, they may not alias. If this is an issue, we could force z
to the heap when we lose track of what happens to the uintptr
result so that it has a stable address.
[3.] Improve the speed of -d=checkptr=1 to the point where it can be enabled at minimal cost in production code.
Do you have specific expectations of what -d=checkptr=1
should/must catch? Or is it just whatever can be done without false positives and at extremely low overhead?
No change in consensus, so accepted. 🎉 This issue now tracks the work of implementing the proposal. — rsc for the proposal review group
This ended up in the compiler/runtime triage queue because the proposal was accepted, but we have some questions about what exactly the actions here are.
I think we're clear on what has to happen for (1), but we're less clear on the requirements for the rest (i.e. how is the compiler implementation restricted, and what should -d=checkptr=1
be looking for exactly? Or is it as simple as "whatever it can with low overhead"? (@mdempsky asked about both of these above.)
If I understand correctly, this means that if unsafe.Add(unsafe.Pointer(x), 1) is valid today, then users would be able to write it as unsafe.Pointer(x + 1) instead? That seems reasonable to me.
Caveat: This isn't actually always safe today with cmd/compile. For example, compiling https://go.dev/play/p/SWa3KGxqKPk with -live shows a live temporary at line 18, but not 11. I think that should be easy to fix though.
SGTM thanks.
[2.] In particular x := (*[1<<63-1]byte)(unsafe.Pointer(1)) is OK and i := uintptr(unsafe.Pointer(&z)) is OK, but x[i-1] is not.
Can you elaborate what you mean that x[i-1] is not okay? Specifically, what obligations are on the implementation to provide when compiling that expression? E.g., is the compiler required to conservatively assume x[i-1] and &z may alias?
By "not OK" here I just meant it's an invalid program. I did not mean that the compiler has to catch it or enable the runtime to catch it. We could imagine disallowing the (*[1<<63-1]byte)(unsafe.Pointer(1)) because it spans the Go address space, but I don't think that's terribly useful since you can still do
x := unsafe.Pointer(1) // OK
i := uintptr(unsafe.Pointer(&z)) // OK
y := unsafe.Add(x, i-1) // oops created a Go pointer
This program is invalid but I don't think we should instrument unsafe.Add to look for transitions from "not Go" argument to "Go" result. unsafe.Add should probably still stay a simple add instruction. We can't actually catch all misuse of unsafe. If someone is determined, they will find a way to write something that defeats checkptr. For example the checkptr implementation is not going to catch you doing:
var x unsafe.Pointer
*(*uint64)(unsafe.Pointer(&x)) = myInteger
use(x)
which probably works most of the time. :-)
Note: An expression like uintptr(unsafe.Pointer(&z)) doesn't force z to be heap allocated. If it's stack allocated and the stack moves between assigning i and evaluating x[i-1], they may not alias. If this is an issue, we could force z to the heap when we lose track of what happens to the uintptr result so that it has a stable address.
Understood. It's not an issue because it's not a valid program.
[3.] Improve the speed of -d=checkptr=1 to the point where it can be enabled at minimal cost in production code.
Do you have specific expectations of what -d=checkptr=1 should/must catch? Or is it just whatever can be done without false positives and at extremely low overhead?
The main thing is to catch honest mistakes, like:
u := reflect.ValueOf(p).Pointer()
y := (*T)(unsafe.Pointer(u))
or similar code with struct offsets added to u, say. (The mistake is that u has type uintptr so it's not going to be updated if p moves or stop p from being GC'ed.)
Whatever can be done without false positives and at extremely low overhead sounds good. -d=checkptr=1 can still do more if there's something important to keep doing with more overhead. The low overhead one would be enabled for -d=checkptr=0.
Background
The documentation for
unsafe.Pointer
currently lists six valid conversion patterns:“Conversion of a
*T1
toPointer
to*T2
… [p]rovided thatT2
is no larger thanT1
and that the two share an equivalent memory layout.”“Conversion of a
Pointer
to auintptr
(but not back toPointer
).”“Conversion of a
Pointer
to auintptr
and back, with arithmetic … in the same expression, with only the intervening arithmetic between them.”“Conversion of a
Pointer
to auintptr
when callingsyscall.Syscall
… [or] in the argument list of a call to a function implemented in assembly.” (Compare #34684.)“Conversion of the result of
reflect.Value.Pointer
orreflect.Value.UnsafeAddr
fromuintptr
toPointer
… immediately after making the call, in the same expression.”“Conversion of a
reflect.SliceHeader
orreflect.StringHeader
Data
field to or fromPointer
… when interpreting the content of an actual slice or string value.”The
unsafeptr
check provided bycmd/vet
warns about uses that do not follow the above patterns. However, it currently flags many violations inx/sys/unix
(see #41205) when addresses returned by system calls such asmmap
are converted to Go pointers, and in code generated by ebitengine/purego (see #56487) when hard-coded addresses provided by the operating system or linker are converted to Go pointers. In both cases, the program is attempting to create Go pointers that refer to known-valid addresses that are not managed by the Go runtime; notably, the compiler's-d=checkptr
mode does not flag them as invalid at run-time.Ideally, the
unsafe.Pointer
documentation, theunsafeptr
check incmd/vet
, the-d=checkptr
mode incmd/compile
, and the real-world usage insyscall
,x/sys
, and similar low-level libraries should all agree on what is valid. This proposal aims to narrow that gap.Proposal
I propose that we add another allowed case in the
unsafe.Pointer
documentation:The
unsafeptr
check incmd/vet
would be changed to allow the new case, resolving the warnings inx/sys
andebitengine/purego
.The compiler's
-d=checkptr
mode would check conversions fromuintptr
tounsafe.Pointer
. The conversion may throw at run-time if:uintptr
refers to an address managed by Go that is not explicitly pinned (#46787), orthe[edit: per https://github.com/golang/go/issues/58625#issuecomment-1440188404, unmapped addresses should be allowed as long as they cannot become Go addresses]uintptr
refers to an invalid (unmapped) nonzero address.Alternatives
The
vet
warning can be worked around today by relying on a liberal reading of the “equivalent memory layout” rule, rewriting(https://go.dev/play/p/ZWZxv6URqTW) as
(https://go.dev/play/p/Wh5f8k0_yyR), on the theory that the memory layout of a
uintptr
must be in some sense “equivalent” to the memory layout ofunsafe.Pointer
. However, I believe that such a rewrite does not capture the intent of the code as accurately, and should not be necessary.(CC @golang/runtime)