golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.59k stars 17.61k forks source link

runtime/trace: missing stacks for trace events emitted when `getg().mp.curg == nil`, like `GoCreate` for `time.AfterFunc` #68093

Open mknyszek opened 4 months ago

mknyszek commented 4 months ago

Here is a simple program that reproduces the issue:

package main

import (
    "log"
    "os"
    "runtime/trace"
    "time"
)

func main() {
    f, err := os.Create("trace.out")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    trace.Start(f)

    time.AfterFunc(1*time.Second, func() {
        println("hello!")
    })

    time.Sleep(5 * time.Second)

    trace.Stop()
}

In this trace, the GoCreate event that creates the goroutine to run println("hello") will have a stack trace for the new goroutine (really, the start PC) but will not have the stack trace in the scheduler/timer code that actually creates the goroutine.

The reason for this is simple: traceStack, the function that obtains a stack trace for the tracer, skips taking a stack if it's called from a system stack. This has been true as far back as I can find. The reason why it does this is because it uses the fact that it runs on a system stack as a signal that it should collect gp.mp.curg's stack instead. This is reasonable -- if a goroutine calls into the scheduler to yield, it'll be running on a system stack when we emit the trace event, but we crucially do not care about that location, but where the user code was when we yielded.

The fix here is, I think, straightforward: collect a trace from a system stack if and only if gp.mp.curg is nil. This should be sufficient to capture the time.AfterFunc case and cases like it.

gabyhelp commented 4 months ago

Similar Issues

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

gopherbot commented 4 months ago

Change https://go.dev/cl/593835 mentions this issue: runtime: collect stacks from g0 for traces