golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.22k stars 17.7k forks source link

cmd/link: lock down future uses of linkname #67401

Closed rsc closed 5 months ago

rsc commented 6 months ago

Overuse of //go:linkname to reach into Go standard library internals (especially runtime internals) means that when we do change the standard library internals in ways that should not matter, we can end up breaking packages that are depended on by a large swath of the Go ecosystem. For example, https://go.dev/cl/583756 broke github.com/goccy/go-json because it turns out that package copied most of the runtime's internal type API. Now we can't change anything in that list, despite that being an ostensibly internal package, without breaking goccy/go-json. And goccy is used by many packages, including Kubernetes.

This situation is unsustainable. Internals are internal for a reason. We can't keep Go programs working when they create explicit dependencies on details that we have kept internal. But we also care a lot about compatibility: we don't want to break Go programs either. The obvious conclusion is that we have to stop Go programs from being able to create these dependencies on internal details in the first place.

This issue tracks work to prevent new //go:linkname-based dependencies and contain existing ones.

Right now, if package A has a symbol and package B wants to refer to it with //go:linkname, there are three patterns:

The ideal goal state is a world where all //go:linkname usage must be in the Handshake form: both sides must agree to use linkname for a given symbol in order for it to succeed. This will mean that arbitrary packages cannot create new dependencies on runtime internals. At the same time, we realize that the current world is not this ideal world, and we don't want to break all existing uses.

Our plan is as follows.

  1. Introduce a new -checklinkname=1 flag to cmd/link that requires the Handshake form for symbols in the standard library. That flag is already landed in at tip, but it is not the default.

  2. Survey all existing open-source Go packages to find standard library symbols that are being //go:linkname'd (behind our backs!) using the Pull pattern. Add the necessary //go:linkname annotations to the standard library to keep those working, documenting why each exists. The explicit //go:linkname lines and documentation will help avoid accidental breakage in future refactoring. We have done a preliminary survey, but we haven't yet added all the necessary //go:linkname lines.

  3. Make -checklinkname=1 the default for Go 1.23. If this breaks anything, users can use -ldflags=-checklinkname=0 to get unbroken, and we hope they will also file reports letting us know what we missed.

  4. As we get reports of additional breakage we missed, add more //go:linkname annotations to the standard library.

At the completion of that plan, we won't be in the ideal world, but we will have accomplished two important things:

Note that anyone who wants to experiment can always build with -ldflags=-checklinkname=0 and linkname whatever they like. That's fine. We like experimenting too. But the fact that the code won't build without special flags should help prevent code that digs into internal details from becoming a core dependency in the Go ecosystem that we end up having to maintain forever.

Note also that for now, //go:linkname can still be used in Pull mode to get at internals of non-standard library packages. We'd like to change that eventually too, insisting on Handshakes everywhere. For now, we are starting with the standard library. If all goes well, we'll circle back and try to devise a plan for the rest of the ecosystem.

gopherbot commented 6 months ago

Change https://go.dev/cl/585820 mentions this issue: runtime/coverage: remove uses of //go:linkname

Jorropo commented 6 months ago

I think github.com/goccy/go-json pulled internal runtime APIs in a reckless manner and there are safe ways to pull internal packages from the std. For example quic-go did an equally dangerous thing with crypto/tls by accessing private fields using unsafe.Pointer and maintaining forks of crypto/tls:

However key differences:

As a downstream of quic-go I clearly understood the situation, the worst was the need to wait a couple of days for quic-go to release a new release compatible with the new go release before updating my go toolchain and the inability to run on tip or RCs.


In the case of github.com/goccy/go-json the situation could have been even better than quic-go's as it claims to be compatible with encoder/json:

Fast JSON encoder/decoder compatible with encoding/json for Go

instead of creating a compile time error they could have stubbed their API by forwarding calls to encoding/json when the release of go were to be unknown. (I don't know the details, maybe due to some edge cases or behaviors only go-json implements this wouldn't have been possible)


What I actually propose:

Continue to allow the Pull kind of linkname from std packages when the file is locked with build tags:

//go:build go1.23.4 && !go1.23.5 && !go1.24

To know if a file is properly locked down, the toolchain can evaluate the build tags with the next future release and it MUST fail to be allowed. That means if a file pass the current version but not the next one, then Pull linknames from the std would be allowed.

There is a downside to this approach which is that if there is no change required between two releases you still need a different file per version with the same implementation, to satisfy this constraint. Maybe parsing the build tags would be better.

I am not sure if goX.Y or goX.Y.Z should be used. goX.Y might be more dangerous in case a fix require breaking some internal API, quic-go used that and it was fine and created this couple of days period where you can't use latest go only every 6 months (note that goX.Y.Z build tags didn't existed back then).

rsc commented 6 months ago

I think ... there are safe ways to pull internal packages from the std.

I completely disagree. Quic-go's use of linkname caused all manner of problems for us release after release too, because anyone using quic-go couldn't update to a new Go version until quic-go did.

gopherbot commented 6 months ago

Change https://go.dev/cl/585916 mentions this issue: internal/coverage/cfile: remove //go:linkname into testing

gopherbot commented 6 months ago

Change https://go.dev/cl/585915 mentions this issue: internal/coverage/cfile: remove more //go:linkname usage

ericlagergren commented 6 months ago

Survey all existing open-source Go packages to find standard library symbols

What about closed source packages?

randall77 commented 6 months ago

We can't fix what we don't know about.

  1. As we get reports of additional breakage we missed, add more //go:linkname annotations to the standard library.

We're open to any reports from closed-source packages. It would be particularly useful to hear about these when the release candidate comes out so they can make the .0 release.

ericlagergren commented 6 months ago

It would be nice to have a flag (maybe in -ldflags?) to allow "pull" style //go:linkname.

I know Go eschews knobs and flags, but closed source projects aren't as big of a problem as open source projects because any breakage is self-contained. And a flag won't be used much by open source projects because it's abnormal for Go packages to require setting specific compiler/linker flags.

I do think this proposal is a good idea, though.

randall77 commented 6 months ago

It would be nice to have a flag (maybe in -ldflags?) to allow "pull" style //go:linkname.

The proposal says:

Note that anyone who wants to experiment can always build with -ldflags=-checklinkname=0 and linkname whatever they like.

ericlagergren commented 6 months ago

Oops, sorry :)

ruyi789 commented 6 months ago

Misuse is undesirable, disabling is even worse, you want to change you maintain that package (go-json), that doesn't solve it?

bjorndm commented 6 months ago

I would like to say that //go:linkname is very useful for certain types of low level programming such as Ebitengine , etc. While I agree the situation with goccy/go-json is bad, we should look at how it is being used in detail and think of alternatives for the legitimate uses.

DmitriyMV commented 6 months ago

@bjorndm should't -ldflags=-checklinkname=0 which disables this check be enough for low level programming? And if you can always raise a proposal if something is missing and truly needed without ld flags.

As we get reports of additional breakage we missed, add more //go:linkname annotations to the standard library.

cherrymui commented 6 months ago

In C, static symbols are not accessible outside of the compilation unit, full stop. There is no way to pull a static symbol from a C library. A number of other languages have similar strict visibility rules. They are very successful languages and are widely used. This suggests that a lot programs can be written and things can go very well without a mechanism to break into a library's internal details.

I don't think Go is fundamentally different. Ideally we could also have strict visibility rules. I would think Go unexported symbols are meant to be similar to C static symbols. In fact, that is what gccgo does. Unfortunately for the gc toolchain it is not the case today. But we can get closer to it. And as we care a lot about compatibility, we'll keep the existing code continue to build in Go 1.23 (Step 2 in the plan). And we have a linker flag to disable the restriction (e.g. for experiments; as far as I know, the C linker doesn't seem to have such an option).

Also, I think the authors of the code should have a way to decide which symbols are visible externally and which are not.

TotallyGamerJet commented 6 months ago

Purego which is a dependency of Oto, Beep, and Ebitengine as well as others doesn't just pull symbols it also pushes since it reimplements runtime/cgo package when CGO_ENABLED=0 entirely in Go. Is there any way the symbols defined in the runtime that hook into that package also get comments to avoid breaking us?

Another potential solution for us is to prebuild runtime/cgo into a cgo_GOOS_GOARCH_GOVERSION.syso that ships with purego. That would save us from having to keep up with any changes that package has and allows the Go team the freedom to change it as they like. Is this actually possible? I tried with setting different -buildmode but none of them would link.

Of course, if the Go team wanted to port runtime/cgo to Go that would be optimal.

cherrymui commented 6 months ago

Is there any way the symbols defined in the runtime that hook into that package also get comments to avoid breaking us?

Push linknames are still allowed. If they are currently pushed from runtime/cgo, I believe you can still push them from Purego. Does Purego push symbols more than runtime/cgo?

Of course, if the Go team wanted to port runtime/cgo to Go that would be optimal.

This might be a possible option, but I think we need to understand the rationales better. The runtime/cgo package is intended to work with cgo, that is, interacting with C code. I'm not sure I understand the use case of runtime/cgo with CGO_ENABLED=0. This is probably better to be a separate discussion. Thanks.

TotallyGamerJet commented 6 months ago

Push linknames are still allowed. If they are currently pushed from runtime/cgo, I believe you can still push them from Purego. Does Purego push symbols more than runtime/cgo?

No, Purego pushes the same symbols that runtime/cgo does which means the "Handshake" is already satisfied for those. Step 2 of the suggested plan only mentions surveying for Pull linknames and marking them with comments to avoid future breakage. I'm wondering if there is any plans for Pushes as changes to those would break Purego?

This might be a possible option, but I think we need to understand the rationales better. The runtime/cgo package is intended to work with cgo, that is, interacting with C code. I'm not sure I understand the use case of runtime/cgo with CGO_ENABLED=0. This is probably better to be a separate discussion. Thanks.

Indeed, runtime/cgo is required to allow Go code and C code to play well with each other. Purego provides an entirely Go version of it so that you can call C code using purego.Dlopen and purego.Dlsym without the need of a C compiler so cross-compiling is again possible. We can discuss this further elsewhere.

cherrymui commented 6 months ago

Thanks.

I'm wondering if there is any plans for Pushes as changes to those would break Purego?

I don't think there is any plan to break the use case of Purego's pushes. If we do anything to restrict push-only ones, they will be equally applied to the ones runtime/cgo pushing to runtime. So we'll need to fix those first (I think many of them are already in handshake form, but it is possible we missed some). And that should make Purego work as well.

gopherbot commented 6 months ago

Change https://go.dev/cl/586259 mentions this issue: runtime: move exit hooks into internal/runtime/exithook

gopherbot commented 6 months ago

Change https://go.dev/cl/586137 mentions this issue: all: add push linknames to allow legacy pull linknames

gopherbot commented 6 months ago

Change https://go.dev/cl/586476 mentions this issue: runtime: stop external packages from using typelinks

gopherbot commented 6 months ago

Change https://go.dev/cl/585556 mentions this issue: cmd/link: enable checklinkname by default

database64128 commented 6 months ago

tfo-go implements TCP Fast Open support. It provides a set of APIs similar to the ones in net, with even the same underlying concrete types (*net.TCPConn, *net.TCPListener) as return values.

In order to do so, it has to tap into the net package to learn about the preferred address family, listener backlog, etc. On Windows, it has to call unexported functions in netpoll, because syscall.RawConn cannot be used for CONNECTEX calls.

Here's the list of pull linknames in this project. Can I open a CL to add them to the standard library?

// tfo-go/netpoll_windows.go

//go:linkname sockaddrToTCP net.sockaddrToTCP
//go:linkname execIO internal/poll.execIO
//go:linkname newFD net.newFD
//go:linkname netFDInit net.(*netFD).init
//go:linkname netFDClose net.(*netFD).Close
//go:linkname netFDCtrlNetwork net.(*netFD).ctrlNetwork
//go:linkname netFDWrite net.(*netFD).Write
//go:linkname netFDSetWriteDeadline net.(*netFD).SetWriteDeadline
//go:linkname rawConnControl net.(*rawConn).Control
//go:linkname rawConnRead net.(*rawConn).Read
//go:linkname rawConnWrite net.(*rawConn).Write

// tfo-go/sockopt_linux.go

//go:linkname listenerBacklog net.listenerBacklog

// tfo-go/tfo_supported.go

//go:linkname ipToSockaddr net.ipToSockaddr
//go:linkname loopbackIP net.loopbackIP
//go:linkname favoriteAddrFamily net.favoriteAddrFamily
philpearl commented 6 months ago

I've used go:linkname in quite a few places, I think usually to do reflection-type things that cause unnecessary allocations if you use the reflect package. I always understood the code might break with changes to Go. That would not upset me. I'd be much more upset by losing the mechanism to use these kinds of backdoors when it seems necessary.

ianlancetaylor commented 6 months ago

@philpearl You can continue to do what you want with your own projects by using the new linker flag. But if you release a package that relies on linknaming, and people start using that package, that becomes a problem for the overall Go ecosystem. We really don't want to break popular packages. If they use linkname we have to make a difficult choice between breaking a popular package or slowing down or even stopping ongoing Go development. Prohibiting new uses of linkname seems like a good path forward.

philpearl commented 6 months ago

Thanks for the response! I hadn't realised that the new flag would be permanent rather than a stop-gap until people had updated their code. (I do have some packages that use linkname, but I'd be shocked if they ever become popular!)

The two reasons I ever used linkname were, I believe

In the second case I decided it was better to use linkname than copy assembly code to get a fast hash function.

I'm afraid I've not recently checked whether these performance issues still exist - but I'd happily remove all use of linkname if I could get the performance I need (which is mostly just no unnecessary allocations) from using the standard library.

cherrymui commented 6 months ago

I think the linker flag will probably stay.

performance issues with reflect

We have been (slowly) improving the performance of reflect. If there is a performance issue that is not already known, feel free to open an issue so we can investigate.

no access to hash functions except via an interface

maphash.Comparable (#54670) might help?

kortschak commented 6 months ago

I now had a chance to look at the impact that this will have on Gonum graph packages after work that Ian did. Initially, I was optimistic that the changes that he made in the iterator package and supporting proposed changes in reflect would allow us to maintain reasonable performance. This is not the case. In primitives I see ~30% increases in CPU cost and 1000% to 10,000% increases in allocs (10% to 1000% increases in allocation volume). This makes the Gonum graph packages untenable.

gopherbot commented 6 months ago

Change https://go.dev/cl/586896 mentions this issue: runtime: push vdsoClockgettimeSym linkname on linux/arm64

rsc commented 6 months ago

I ran a scan against an updated copy of my Go corpus, filtered to packages with at least N direct+indirect dependents. Here are the results for varying N. Each file cuts off where the next file begins, so for example linkname100.html only shows linknames from modules with >= 100 but < 200 dependents.

https://swtch.com/tmp/linkname1.html https://swtch.com/tmp/linkname100.html https://swtch.com/tmp/linkname200.html https://swtch.com/tmp/linkname500.html https://swtch.com/tmp/linkname1000.html https://swtch.com/tmp/linkname2000.html https://swtch.com/tmp/linkname5000.html https://swtch.com/tmp/linkname10000.html https://swtch.com/tmp/linkname20000.html https://swtch.com/tmp/linkname50000.html https://swtch.com/tmp/linkname100000.html

I will send some CLs adding the linknames I found, starting with the largest number of dependents, and we can decide where exactly we want to stop.

rsc commented 6 months ago

@database64128 According to deps.dev, almost nothing depends on tfo-go (v1 v2) yet. I'd rather work with you to figure out a linkname-free way to do what you need to do than add those linknames. It's suspicious to be linknaming exported methods, since you can just call those methods. Can you open a separate bug for tfo-go's linknames and we can discuss there? Thanks!

rsc commented 6 months ago

@kortschak Did you mean to comment on a different bug? This one is about locking down //go:linkname.

cespare commented 6 months ago

@rsc #66125 and https://github.com/gonum/gonum/pull/1937 describe the performance problems that gonum uses unsafe linkname-ing to work around. (Some code is here.)

database64128 commented 6 months ago

According to deps.dev, almost nothing depends on tfo-go (v1 v2) yet.

There are forks that are dependencies of projects with over 10k stars. They forked the project because I raised the minimum supported Go version to 1.21 but they want to support as old as 1.18.

It's suspicious to be linknaming exported methods, since you can just call those methods.

These are exported methods on unexported structs (net.(*netFD) and net.(*rawConn)).

I'd rather work with you to figure out a linkname-free way to do what you need to do than add those linknames.

Can you open a separate bug for tfo-go's linknames and we can discuss there?

Thanks, I'll open an issue later this week. But I'm afraid the whole process is going to take some time, and right now these projects need the linknames to be able to build with gotip.

rsc commented 6 months ago

GitHub stars are not necessarily important. A starred but not-depended-on project will not cause much breakage (unless the dependent counts are wrong?). No projects need to build with gotip (and main packages can use the flag).

My focus right now is on projects with many dependents. I will circle back to tfo-go.

rsc commented 6 months ago

@cespare and @kortschak, I have a pending CL that will re-enable linkname of mapiterkey, mapiterelem, mapiternext, which I see gonum uses.

kortschak commented 6 months ago

@rsc Thank you so much for that. I would love to move off the use that we have, and maybe that will be possible in the future, but at the moment, it is unfortunately not.

rsc commented 6 months ago

@kortschak Completely understood. We just missed gonum's usage when preparing the initial list of users. I am working with a more complete list now.

rsc commented 6 months ago

Reopening for adding the linknames I found in my more complete scan.

rsc commented 6 months ago

I just mailed CLs to catch the linknames in all modules with >= 1000 dependents. I will take a closer look at

https://swtch.com/tmp/linkname1.html https://swtch.com/tmp/linkname100.html https://swtch.com/tmp/linkname200.html https://swtch.com/tmp/linkname500.html

later to see if we should lower the bar or cherry-pick specific ones. (Bazel-gazelle's are probably worth doing, for example.)

gopherbot commented 6 months ago

Change https://go.dev/cl/587216 mentions this issue: runtime: revert "move zeroVal to internal/abi"

gopherbot commented 6 months ago

Change https://go.dev/cl/587218 mentions this issue: all: document legacy //go:linkname for modules with ≥20,000 dependents

gopherbot commented 6 months ago

Change https://go.dev/cl/587219 mentions this issue: all: document legacy //go:linkname for modules with ≥10,000 dependents

gopherbot commented 6 months ago

Change https://go.dev/cl/587217 mentions this issue: all: document legacy //go:linkname for modules with ≥50,000 dependents

gopherbot commented 6 months ago

Change https://go.dev/cl/587220 mentions this issue: all: document legacy //go:linkname for modules with ≥5,000 dependents

gopherbot commented 6 months ago

Change https://go.dev/cl/587215 mentions this issue: all: document legacy //go:linkname for modules with ≥100,000 dependents

gopherbot commented 6 months ago

Change https://go.dev/cl/587222 mentions this issue: all: document legacy //go:linkname for modules with ≥1,000 dependents

gopherbot commented 6 months ago

Change https://go.dev/cl/587221 mentions this issue: all: document legacy //go:linkname for modules with ≥2,000 dependents

thaJeztah commented 6 months ago

@rsc as you're making your way through the list, but not sure if it shows up in "volumes" w.r.t. Go dependent; looks like this broke OCI runc; more details / discussion here;

cc @lifubang @kolyshkin

anthonywong1221 commented 6 months ago

Although I don't believe using //go:linkname is a good practice, labeling widely-used libraries as 'members of the hall of shame' is childish and not a constructive way to address this issue.