golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.16k stars 17.69k forks source link

runtime/debug: PrintStack function prints incorrect stack information in go1.21.0 #62132

Closed liuzengh closed 1 year ago

liuzengh commented 1 year ago

What version of Go are you using (go version)?

$ go version
go version go1.21.0 darwin/arm64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

GO111MODULE='on'
GOARCH='arm64'
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOOS='darwin'
GOTOOLDIR='/Users/amdahliu/go/go1.21.0/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.21.0'
GCCGO='gccgo'
AR='ar'
CC='clang'
CXX='clang++'
CGO_ENABLED='1'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/_f/77ggdqd951g1fc16zybympzc0000gn/T/go-build2493489500=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

I write a unit test:

func TestRuntimeDebugPrintStack(t *testing.T) {
    err := func() error {
        return func() error {
            return func() error {
                return func() error {
                    debug.PrintStack()
                    return newMyErr(11, "TestTraceErrorSetStackSkip error")
                }()

            }()
        }()
    }()
    require.NotNil(t, err)
}

What did you see?

There are two TestRuntimeDebugPrintStack appearing.

Original Screenshot Pasted Graphic
=== RUN   TestRuntimeDebugPrintStack
goroutine 6 [running]:
runtime/debug.Stack()
    /Users/amdahliu/go/go1.21.0/src/runtime/debug/stack.go:24 +0x64
runtime/debug.PrintStack()
    /Users/amdahliu/go/go1.21.0/src/runtime/debug/stack.go:16 +0×1c
trpc.group/trpc-go/trpc-go/errs_test.TestRuntimeDebugPrintStack.TestRuntimeDebugPrintStack.func1.func2.1.1()
    /Users/amdahliu/GolandProjects/github/trpc-go/errs/errstest.go:151 +0×1c
trpc.group/trpc-go/trpc-go/errs_test.TestRuntimeDebugPrintStack.TestRuntimeDebugPrintStack.func1.func2.1()
    /Users/amdahliu/GolandProjects/github/trpc-go/errs/errstest.go:153 +0x1c
trpc.group/trpc-go/trpc-go/errs_test.TestRuntimeDebugPrintStack.TestRuntimeDebugPrintStack.func1.func2()
    /Users/amdahliu/GolandProjects/github/trpc-go/errs/errstest.go:155 +0×1c
trpc.group/trpc-go/trpc-go/errs_test.TestRuntimeDebugPrintStack.func1(...)
    /Users/amdahliu/GolandProjects/github/trpc-go/errs/errstest.go:156
trpc.group/trpc-go/trpc-go/errs_test.TestRuntimeDebugPrintStack(0x0?)
    /Users/amdahliu/GolandProjects/github/trpc-go/errs/errstest.go:157 +0×24
testing.tRunner(0x1400015440, 0×10473c838)
    /Users/amdahliu/go/go1.21.0/src/testing/testing.go:1595 +0xe8
created by testing.(*T).Run in goroutine 1
    /Users/amdahliu/go/go1.21.0/src/testing/testing.go:1648 +0x33c
--- PASS: TestRuntimeDebugPrintStack (0.00s)
PASS

What did you expect to see?

But only one TestRuntimeDebugPrintStack is expected. I expect to see a result similar to the following in go version go1.20.4 darwin/arm64.

Original Screenshot Pasted Graphic 1
=== RUN   TestTraceErrorSetStackSkip
goroutine 18 [running]:
runtime/debug.Stack()
    /opt/homebrew/opt/go/libexec/src/runtime/debug/stack.go:24 +0×64
runtime/debug.PrintStack()
    /opt/homebrew/opt/go/libexec/src/runtime/debug/stack.go:16 +0×1c
trpc.group/trpc-go/trpc-go/errs_test.TestTraceErrorSetStackSkip.func1.1.1.1()
    /Users/amdahliu/GolandProjects/github/trpc-go/errs/errstest.go:169 +0x1c
trpc.group/trpc-go/trpc-go/errs_test.TestTraceErrorSetStackSkip.func1.1.1()
    /Users/amdahliu/GolandProjects/github/trpc-go/errs/errstest.go:171•+0x1c
trpc.group/trpc-go/trpc-go/errs_test.TestTraceErrorSetStackSkip.func1.1()
    /Users/amdahliu/GolandProjects/github/trpc-go/errs/errstest.go:173+0x1c
trpc.group/trpc-go/trpc-go/errs_test.TestTraceErrorSetStackSkip.func1()
    /Users/amdahliu/GolandProjects/github/trpc-go/errs/errstest.go:174 +0x1c
trpc.group/trpc-go/trpc-go/errs_test.TestTraceErrorSetStackSkip(0x1400011cd00)
    /Users/amdahliu/GolandProjects/github/trpc-go/errs/errstest.go:175 +0x4c
testing.tRunner(0×1400011cd00, 0x102523568)
    /opt/homebrew/opt/go/libexec/src/testing/testing.go:1576 +0×10c
created by testing.(*T).Run
    /opt/homebrew/opt/go/libexec/src/testing/testing.go:1629 +0×368
dmitshur commented 1 year ago

CC @golang/runtime.

cherrymui commented 1 year ago

There are two TestRuntimeDebugPrintStack appearing.

You mean two TestRuntimeDebugPrintStack in function names like TestRuntimeDebugPrintStack.TestRuntimeDebugPrintStack.func1.func2 ? The names for function literals are not guaranteed to be stable, and in 1.21 we changed how function literals are named. So this is technically allowed and not incorrect.

The two TestRuntimeDebugPrintStack do seem a bit clutter, though. Maybe we could do something better when a closure (which contain inner closures) is inlined into its containing function. cc @mdempsky

cherrymui commented 1 year ago

@liuzengh for next time, please paste output as text, instead of image. Thanks.

liuzengh commented 1 year ago

Yes, I mean two TestRuntimeDebugPrintStack in function names like TestRuntimeDebugPrintStack.TestRuntimeDebugPrintStack.func1.func2.

Ok, I will paste output as text for next issue. This is my first time opening an issue for golang, I hope you don't mind.

mitar commented 1 year ago

I was just about to report this issue as well. I made reproduction here. Compare Go output from Go 1.20 and 1.21. I noticed this because my errors package tests started failing with Go 1.21. I check for how stack trace is formatted there.

I do not mind so much that func1.1 changed to func1.func2, but that main is twice is very verbose (in example I made you can see it three times but this is because the package name is also main).

aclements commented 1 year ago

This is working as intended, but we agree it's confusing. We're exploring other naming options over in #60324. I'll close this issue in favor of the discussion on #60324.

mitar commented 1 year ago

Are you sure this is the same? https://github.com/golang/go/issues/60324 talks about the change from main.f.func1 to main.main.func1. But this issue talks about the issue (if I use same symbol names) of the change from main.f.func1 to main.main.func1.1 or something like that. So it is not just inlining. In this issue we are observing increase in the number of segments (duplicate the first one). And not just that a closure is named badly because of the inlining.

aclements commented 1 year ago

I'm certainly happy to reopen this if it's a different issue, but I'm not sure I follow your argument for why it's not the same.

Here's the original example with names so I can refer to the closures:

func TestRuntimeDebugPrintStack(t *testing.T) {
    err := func() error {  // F1
        return func() error {  // F2
            return func() error {  // F3
                return func() error {  // F4
                    debug.PrintStack()
                    return newMyErr(11, "TestTraceErrorSetStackSkip error")
                }()

            }()
        }()
    }()
    require.NotNil(t, err)
}

In Go 1.21, we started more aggressively inlining functions that themselves constructed closures. Prior to Go 1.21, none of these got inlined, and thus we see their "original" names based on hierarchical numbering in the traceback.

In Go 1.21, F1 gets inlined into TestRuntimeDebugPrintStack. The inlined copy of F1's body needs to clone F2, F3, and F4 because they may be optimized differently based on the context F1 was inlined into (though in this case I don't think they do). Call those clones F2', F3', and F4'. We currently give those clones different symbol names to disambiguate them (this happened in Go 1.20 as well, albeit following different rules, but it doesn't matter here because Go 1.20 didn't perform any inlining in this example).

F2's pre-inlined name is TestRuntimeDebugPrintStack.func1.1 (the easiest way to see this is to pass -gcflags -l). But let's pull out F1 and give it its own name:

func TestRuntimeDebugPrintStack(t *testing.T) {
    F1()
}
func F1() error {  // F1
    return func() error {  // F2
        return func() error {  // F3
            return func() error {  // F4
                debug.PrintStack()
                return nil
            }()
        }()
    }()
}

If we run this, we get:

x.TestRuntimeDebugPrintStack.F1.func1.1.1()
    /home/austin/go.dev/austin/62132/main_test.go:28 +0xf
x.TestRuntimeDebugPrintStack.F1.func1.1()
    /home/austin/go.dev/austin/62132/main_test.go:30 +0xf
x.TestRuntimeDebugPrintStack.F1.func1()
    /home/austin/go.dev/austin/62132/main_test.go:31 +0xf
x.F1(...)
    /home/austin/go.dev/austin/62132/main_test.go:32
x.TestRuntimeDebugPrintStack(0xc000007860?)
    /home/austin/go.dev/austin/62132/main_test.go:22 +0x10

Here we can again see that F1 got inlined into TestRuntimeDebugPrintStack. The original name of F2 is F1.func1, but when F1 get inlined, the name of F2' is TestRuntimeDebugPrintStack.F1.func1 because the compiler needed to disambiguate the particular copy of F2 inlined into TestRuntimeDebugPrintStack.

Finally we can get to the crux of what's going on: recall that in the original example, the original name of F2 is TestRuntimeDebugPrintStack.func1.1. If we do the same transformation as in the second example, we get TestRuntimeDebugPrintStack.TestRuntimeDebugPrintStack.func1.1. Due to other vagaries, the end of the name changes a little, but this shows where the doubling comes from. The first TestRuntimeDebugPrintStack is the context F2 is inlined into, and the second TestRuntimeDebugPrintStack is the context it came from.

These inlining breadcrumbs are what we're discussing in #60324. This happens to be a good example of where the current convention has somewhat odd, verbose, and unexpected results.

(Sorry that got rather long! I was also convincing myself that it's in fact the same issue :sweat_smile:)

mitar commented 1 year ago

I see. Thanks for detailed explanation. To me the confusing thing is that the number of function names shown changes between:

TestRuntimeDebugPrintStack.TestRuntimeDebugPrintStack.func1.func2.1.1 // 6 function names

and

TestTraceErrorSetStackSkip.func1.1.1.1 // 5 function names

I can understand that how names are formatted changes. Or that one name inherits the name from another function. But how could the number of functions change? I think I get it now. Thanks.