golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.38k stars 17.71k forks source link

runtime: provide a way to opt in to stack trace printing for process-terminating signals in secure-mode binaries on Unix OSes #69868

Open cespare opened 1 month ago

cespare commented 1 month ago

Go version

go version go1.23.1 linux/amd64

Output of go env in your module/workspace:

n/a

What did you do?

Consider two trivial go programs. First, sleep.go:

//go:build ignore

package main

import "time"

func main() {
    time.Sleep(time.Hour)
}

Second, panic.go:

//go:build ignore

package main

func main() {
    panic("asdf")
}

First, run sleep.go and then interrupt it by hitting C-\ (SIGQUIT):

$ rm -f sleep && go build -o sleep sleep.go && ./sleep
^\SIGQUIT: quit
PC=0x46c46e m=0 sigcode=128

goroutine 0 gp=0x50da00 m=0 mp=0x50e660 [idle]:
... lots of stack trace elided...
rflags 0x246
cs     0x33

Next, run panic.go in the same way:

$ rm -f panic && go build -o panic panic.go && ./panic
panic: asdf

goroutine 1 [running]:
main.main()
        /home/caleb/w/_scratch/x/panic.go:6 +0x25

All normal stuff. Now let's do the same thing but with a binary that has CAP_SYS_ADMIN:

$ rm -f sleep && go build -o sleep sleep.go && sudo setcap 'cap_sys_admin+ep' ./sleep && ./sleep
^\SIGQUIT: quit
$ rm -f panic && go build -o panic panic.go && sudo setcap 'cap_sys_admin+ep' ./panic && ./panic
panic: asdf

In both cases (SIGQUIT and a normal panic), printing stack traces is suppressed by the Go runtime. This is a security measure implemented last year in #60272.

If I want to bypass this stack trace suppression anyway, I can do so for panics by calling runtime/debug.SetTraceback. For example, If I add the line

debug.SetTraceback("single")

to panic.go, then it restores the normal unprivileged panic stack trace printing behavior:

$ rm -f panic && go build -o panic panic.go && sudo setcap 'cap_sys_admin+ep' ./panic && ./panic
panic: asdf

goroutine 1 [running]:
main.main()
        /home/caleb/w/_scratch/x/panic.go:9 +0x32

However, as far as I know, there is no equivalent way to opt in to the default signal-handling stack trace printing behavior. If I want to print stack traces on SIGQUIT, I'd have to listen for the signal and do it myself.

What did you see happen?

No way to turn on stack trace printing for process-terminating signals for this CAP_SYS_ADMIN binary.

What did you expect to see?

I believe there should be a way to ask the runtime to print stack traces on SIGQUIT when running in secure mode. One way to to this would be to piggyback on debug.SetTraceback and restore the signal stack trace printing behavior when SetTraceback is called with some value other than "none" (or perhaps only when called with "all" or "single", or perhaps only when called with "all").

Alternatively, we can provide a new API such as runtime/debug.SetSignalTraceback(bool).

cc:

gabyhelp commented 1 month ago

Related Issues and Documentation

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

cespare commented 1 month ago

68103 is an interesting related bug. It looks like the debug.SetTraceback trick only works for panics and not fatal errors.

ianlancetaylor commented 1 month ago

CC @golang/runtime @golang/security

rolandshoemaker commented 1 month ago

I'm somewhat concerned about all of a sudden giving debug.SetTraceback security properties which it previously didn't. Everyone who has set a non-default traceback value will, all of a sudden, have disabled a security feature that may not be aware of.

In general, letting this be a switch set in code is somewhat complicated from a security perspective. What the developer wants and expects is likely to be somewhat different from what the person deploying the binary wants/expects, especially if they are using the various capability restriction/elevation features. In particular I'm not sure if there is really a way to make it clear to users whether the developer has called debug.SetTraceback, and as such if potentially sensitive data will be elided or not.

cespare commented 1 month ago

It's currently the case today that debug.SetTraceback lets the user disable (one part of) this security measure. People who wrote code before 2023 that called debug.SetTraceback never got the security measure insofar as it applies to internal panics. So we're already in the world you're talking about.

What the developer wants and expects is likely to be somewhat different from what the person deploying the binary wants/expects,

I'm definitely sympathetic to this idea. However, the change in #60272 causes the program to completely ignore GOTRACEBACK (rather than, say, setting a different default value). So the knob that's most easily available to the deployer of the binary (an environment variable) cannot control this setting; the only (partial) knob that's available is debug.SetTraceback.

For my needs, I don't particularly care whether the solution is debug.SetTraceback, some new runtime/debug function, GOTRACEBACK, some new environment variable, or some other way entirely. But I need to be able to turn these stack traces back on.

griesemer commented 1 month ago

This probably needs a proposal to move forward.

cespare commented 1 month ago

I can turn this into a proposal, but I'd appreciate some feedback about what direction the runtime/security folks would find most palatable.

By the way, I should say that from my perspective as a user, I'm not asking for a new feature but rather a fix for a regression. We had a program that produced stack traces when it crashed, but since the changes in #60272, it has produced useless empty output instead. This includes cases where tests deadlock and produce no output when they are killed with SIGQUIT, making them impossible to debug. (The test binary is given CAP_SYS_ADMIN in order to exercise certain containerization functionality.)

mknyszek commented 1 month ago

In compiler/runtime triage, we think this might be best as a new API (not exactly sure what form it should take), though whatever we do needs @rolandshoemaker's input.