golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

debug/macho: leading underscore stripped for some C++ symbols #59022

Open BrytonLee opened 1 year ago

BrytonLee commented 1 year ago

What version of Go are you using (go version)?

$ go version
go version go1.19.5 darwin/arm64

Does this issue reproduce with the latest release?

Yes. The behavior is unchanged since change 06e5529.

The latest code is here: https://github.com/golang/go/blob/master/src/debug/macho/file.go#L503

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/lichengdong/Library/Caches/go-build"
GOENV="/Users/lichengdong/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/lichengdong/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/lichengdong/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/opt/homebrew/Cellar/go/1.19.5/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/opt/homebrew/Cellar/go/1.19.5/libexec/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.19.5"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/g4/4_y2t8q179s1n1h40218_3rm0000gn/T/go-build3619303289=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Build a binary from C++ source code, list its symbols with the system nm/objdump, and dump the same symbols with debug/macho. Then compare the two results.

For example, the C++ symbol that nm/objdump shows as __GLOBAL__sub_I_main.mm loses its first leading underscore _ in debug/macho.
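The debug/macho side of this comparison can be sketched roughly as follows. This is a minimal sketch rather than the exact program used for the report: the helper name machoSymbolNames is hypothetical, and the path to the binary is supplied by the caller.

```go
package main

import (
	"debug/macho"
	"fmt"
	"os"
)

// machoSymbolNames (hypothetical helper) returns the symbol names exactly as
// debug/macho reports them, so they can be compared with nm/objdump output.
func machoSymbolNames(path string) ([]string, error) {
	f, err := macho.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	// A Mach-O file without a symbol table has a nil Symtab.
	if f.Symtab == nil {
		return nil, nil
	}
	names := make([]string, 0, len(f.Symtab.Syms))
	for _, s := range f.Symtab.Syms {
		names = append(names, s.Name)
	}
	return names, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintf(os.Stderr, "usage: %s <mach-o file>\n", os.Args[0])
		os.Exit(2)
	}
	names, err := machoSymbolNames(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```

Running this against the same binary that was passed to nm should make any difference in leading underscores visible line by line.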

What did you expect to see?

The same result as the nm/objdump tools.

What did you see instead?

nm/objdump output

cherrymui commented 1 year ago

This is unfortunate, but I'm not sure what the best option is. Maybe we could keep the underscore if there are two leading underscores? There is always some inaccuracy, as in the end one can write a symbol name that looks like a Go symbol but is not really a Go symbol.

What is your use case? Is there a simple workaround that you can do? Thanks.
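The "keep the underscore if there are two leading underscores" idea could look something like the sketch below. The function name keepDoubleUnderscore is hypothetical, and the sketch deliberately ignores whatever other conditions the real debug/macho code applies before stripping; it only illustrates the proposed rule.

```go
package main

import (
	"fmt"
	"strings"
)

// keepDoubleUnderscore (hypothetical) strips a single leading underscore,
// but leaves names that start with two underscores untouched.
func keepDoubleUnderscore(name string) string {
	if strings.HasPrefix(name, "__") {
		return name // likely not a Go symbol; keep it as stored
	}
	// TrimPrefix removes at most one leading "_".
	return strings.TrimPrefix(name, "_")
}

func main() {
	for _, n := range []string{"_main.main", "__GLOBAL__sub_I_main.mm", "runtime.text"} {
		fmt.Printf("%q -> %q\n", n, keepDoubleUnderscore(n))
	}
}
```

As the comment above notes, any rule of this kind stays a heuristic: a non-Go symbol with a single leading underscore would still be altered.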

ianlancetaylor commented 1 year ago

It seems to me that it's a mistake for a package like debug/macho to tamper with symbol names. If we want tools like cmd/nm and cmd/objdump to drop a leading underscore, that should be handled elsewhere, perhaps with a list of formats for which a leading underscore is added.

For example, for windows-386 all symbols have a leading underscore, but debug/pe doesn't strip that leading underscore as far as I can tell.
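A rough sketch of that split, assuming the debug package returns names untouched and a tool decides what to display. The format keys and the names underscoreFormats and toolDisplayName are made up for illustration and are not part of cmd/nm or cmd/objdump.

```go
package main

import (
	"fmt"
	"strings"
)

// underscoreFormats is a hypothetical list of formats for which a leading
// underscore is added to symbol names. The keys are illustrative only.
var underscoreFormats = map[string]bool{
	"macho":  true,
	"pe-386": true,
}

// toolDisplayName (hypothetical) is what a tool could apply for display,
// leaving the raw name from the debug package unmodified.
func toolDisplayName(format, raw string) string {
	if underscoreFormats[format] {
		return strings.TrimPrefix(raw, "_") // drops at most one underscore
	}
	return raw
}

func main() {
	fmt.Println(toolDisplayName("macho", "_main"))
	fmt.Println(toolDisplayName("elf", "_main"))
}
```

With this arrangement, callers of the debug package always see the names stored in the binary, and only the display layer carries format-specific knowledge.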

cherrymui commented 1 year ago

We could change debug/macho to not strip the leading underscore. We could also make cmd/nm and cmd/objdump not do that. Then the names would be identical to the symbol names in the binary's symbol table. Not sure if that would break anyone, though (as some symbols would have a leading underscore that was not there before).

BrytonLee commented 1 year ago

> This is unfortunate, but I'm not sure what the best option is. Maybe we could keep the underscore if there are two leading underscores? There is always some inaccuracy, as in the end one can write a symbol name that looks like a Go symbol but not really a Go symbol.
>
> What is your use case? Is there a simple workaround that you can do? Thanks.

A normal C++ program with a static class generates symbols like __GLOBAL__sub_I*. I haven't tried any workaround, but I think comparing the symbol name against "__GLOBAL__sub_I" may work for C++ static class/variable symbols.

However, I am not sure whether this would break other tools; I am not a C++ expert. :-(
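One possible reading of this workaround, applied on the caller's side after debug/macho has already dropped one underscore, is sketched below. The names strippedInitPrefix and restoreCxxInitName are hypothetical, and the sketch assumes the stripped form starts with "_GLOBAL__sub_I"; it has not been checked against other C++ toolchains.

```go
package main

import (
	"fmt"
	"strings"
)

// strippedInitPrefix is the assumed form of "__GLOBAL__sub_I" after one
// leading underscore has been removed.
const strippedInitPrefix = "_GLOBAL__sub_I"

// restoreCxxInitName (hypothetical) puts the dropped underscore back on
// names that look like C++ static initializer symbols.
func restoreCxxInitName(name string) string {
	if strings.HasPrefix(name, strippedInitPrefix) {
		return "_" + name
	}
	return name
}

func main() {
	fmt.Println(restoreCxxInitName("_GLOBAL__sub_I_main.mm"))
	fmt.Println(restoreCxxInitName("main.main"))
}
```

This only covers the static-initializer case from this report; other compiler-generated symbols would need their own prefixes.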

BrytonLee commented 1 year ago

Yes. I believe that would confuse people who compile Go programs like the one in https://github.com/golang/go/issues/33808. They will find that some symbols (exported symbols?) have a leading _, which differs from the name in the source code.

IMO, we may need to distinguish Go symbols from symbols of other languages (like C++), and then treat the two cases separately.

cherrymui commented 1 year ago

Too late for 1.21. I'll do it in Go 1.22. (This is a user-visible behavior change, so it is probably not a good idea to do this late in the freeze.)

cherrymui commented 11 months ago

I missed this before the 1.22 freeze. I'll do this early next cycle.