golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

cmd/link: the value stored in pcHeader in `runtime.firstmoduledata` is different from `runtime.pclntab` when using external links on `macOS arm64 cgo` #69428

Closed Zxilly closed 1 month ago

Zxilly commented 1 month ago

Go version

go version devel go1.24-2a10a5351b Wed Aug 14 12:32:08 2024 +0800 windows/amd64

Output of go env in your module/workspace:

set GO111MODULE=on
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\zxilly\AppData\Local\go-build
set GOENV=C:\Users\zxilly\AppData\Roaming\go\env
set GOEXE=.exe
set GOEXPERIMENT=
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=C:\Users\zxilly\go\pkg\mod
set GONOPROXY=1
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\zxilly\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=E:/Temp/go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLCHAIN=auto
set GOTOOLDIR=E:\Temp\go\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=devel go1.24-2a10a5351b Wed Aug 14 12:32:08 2024 +0800
set GODEBUG=
set GOTELEMETRY=local
set GOTELEMETRYDIR=C:\Users\zxilly\AppData\Roaming\go\telemetry
set GCCGO=gccgo
set GOAMD64=v1
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=NUL
set GOWORK=
set CGO_CFLAGS=-O2 -g
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-O2 -g
set CGO_FFLAGS=-O2 -g
set CGO_LDFLAGS=-O2 -g
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=C:\Users\zxilly\AppData\Local\Temp\go-build119807883=/tmp/go-build -gno-record-gcc-switches

What did you do?

I'm trying to extract the moduledata.pcHeader value from binaries, and my integration test has been showing this error since the Go 1.21 release.

Prior to Go 1.21, the runtime.text symbol would still be present in the binary even if the -s flag was passed, which masked the problem.

What did you see happen?

The value stored in pcHeader in runtime.firstmoduledata is different from the address of runtime.pclntab when using external linking on macOS arm64 with cgo.

Can be reproduced with the following code:

package main

import (
    "debug/macho"
    "encoding/binary"
    "fmt"
)

func main() {
    f, err := macho.Open("bin-darwin-1.23-arm64-cgo")
    if err != nil {
        panic(err)
    }

    pclntabAddr := uint64(0)
    moduledataAddr := uint64(0)

    for _, s := range f.Symtab.Syms {
        if s.Name == "runtime.firstmoduledata" {
            moduledataAddr = s.Value
        }
        if s.Name == "runtime.pclntab" {
            pclntabAddr = s.Value
        }
    }

    if moduledataAddr == 0 {
        panic("runtime.firstmoduledata not found")
    }
    if pclntabAddr == 0 {
        panic("runtime.pclntab not found")
    }

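    // Read the first 8 bytes of runtime.firstmoduledata, i.e. the raw
    // pcHeader pointer exactly as it is stored in the file.
    data := make([]byte, 8)
    for _, prog := range f.Sections {
        // The 8 bytes must lie entirely inside the section.
        if prog.Addr <= moduledataAddr && moduledataAddr+8 <= prog.Addr+prog.Size {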
            if _, err := prog.ReadAt(data, int64(moduledataAddr-prog.Addr)); err != nil {
                panic(err)
            }
            break
        }
    }

    // Encode pclntabAddr as 8 little-endian bytes for comparison.
    pclntabAddrBinary := make([]byte, 8)
    binary.LittleEndian.PutUint64(pclntabAddrBinary, pclntabAddr)

    // Print the address of the runtime.pclntab symbol as hex.
    fmt.Println("runtime.pclntab addr:")
    for _, b := range pclntabAddrBinary {
        fmt.Printf("%02x ", b)
    }
    fmt.Println()
    // Print the first 8 bytes of runtime.firstmoduledata (the stored pcHeader pointer) as hex.
    fmt.Println("runtime.firstmoduledata.pcHeader ptr:")
    for _, b := range data {
        fmt.Printf("%02x ", b)
    }
    fmt.Println()
}

This results in:

runtime.pclntab addr:
40 26 2a 00 01 00 00 00
runtime.firstmoduledata.pcHeader ptr:
40 26 2a 00 00 00 10 00

A script shows a similar problem on recent major Go releases (the leading number is the Go 1.x minor version):

// format:
// pclntab addr value
// generated search pattern (pclntab addr as little-endian bytes)
// real value in the binary

19
pclntabAddr 4295817152
c0 f7 0c 00 01 00 00 00 
c0 f7 0c 00 00 00 10 00

20
pclntabAddr 4295805984
20 cc 0c 00 01 00 00 00 
20 cc 0c 00 00 00 10 00

21
pclntabAddr 4298061728
a0 37 2f 00 01 00 00 00 
a0 37 2f 00 00 00 10 00

22
mdAddr 4298416864 mdSect __noptrdata off 6112
pclntabAddr 4297689728
80 8a 29 00 01 00 00 00 
80 8a 29 00 00 00 10 00

23
pclntabAddr 4297729600
40 26 2a 00 01 00 00 00 
40 26 2a 00 00 00 10 00

This only happens on macOS arm64 with external linking.

Here are some binaries to use as input:

https://github.com/Zxilly/go-testdata/releases/download/latest/bin-darwin-1.22-arm64-strip-cgo
https://github.com/Zxilly/go-testdata/releases/download/latest/bin-darwin-1.22-arm64-strip-pie-cgo
https://github.com/Zxilly/go-testdata/releases/download/latest/bin-darwin-1.23-arm64-strip-cgo
https://github.com/Zxilly/go-testdata/releases/download/latest/bin-darwin-1.23-arm64-strip-pie-cgo

What did you expect to see?

These values should be the same, as they are on other architectures.

Zxilly commented 1 month ago

This is an advisory issue rather than a bug, as it doesn't seem to affect the operation of the Go binary.

ianlancetaylor commented 1 month ago

CC @cherrymui @golang/runtime

cherrymui commented 1 month ago

I think the moduledata references pcHeader with a dynamic relocation (or the equivalent on Mach-O, a "bind" or "rebase" entry). The data you read directly from moduledata does not have the relocation applied. You'll need to decode and apply the relocation.

It is possible that the Go linker and the C linker (or even different versions of the C linker) use different relocation mechanisms, and the pre-relocation data may or may not be meaningful. They are semantically equivalent, so at program run time the pointer always points to the right data.

I'm not sure if there is anything we can do.

Zxilly commented 1 month ago

If that's the reason, I wonder why setting buildmode to exe or pie doesn't change this behaviour. I looked at some other issues: is it because Apple doesn't support generating non-pie binaries anymore either? Also, this behaviour is not observed on amd64.

cherrymui commented 1 month ago

is it because Apple doesn't support generating non-pie binaries anymore either?

That is correct.

Again, the un-relocated data varies depending on architecture, link mode, linker version, etc. It may happen to be useful, or not.

Zxilly commented 1 month ago

I investigated the issue further and it doesn't seem to be a relocation issue. Please check the attached images: the sections containing moduledata and pclntab both have an Nreloc of 0.

image

image

This can also be verified with llvm-otool:

```
bin-darwin-1.23-arm64-pie-cgo:
Mach header
      magic  cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
 0xfeedfacf 16777228          0  0x00           2    21       3440 0x00200085
Load command 0
  cmd LC_SEGMENT_64 cmdsize 72 segname __PAGEZERO vmaddr 0x0000000000000000 vmsize 0x0000000100000000 fileoff 0 filesize 0 maxprot 0x00000000 initprot 0x00000000 nsects 0 flags 0x0
Load command 1
  cmd LC_SEGMENT_64 cmdsize 632 segname __TEXT vmaddr 0x0000000100000000 vmsize 0x0000000000268000 fileoff 0 filesize 2523136 maxprot 0x00000005 initprot 0x00000005 nsects 7 flags 0x0
  Section sectname __text segname __TEXT addr 0x0000000100002080 size 0x0000000000239340 offset 8320 align 2^4 (16) reloff 0 nreloc 0 flags 0x80000400 reserved1 0 reserved2 0
  Section sectname __stubs segname __TEXT addr 0x000000010023b3c0 size 0x0000000000000594 offset 2339776 align 2^2 (4) reloff 0 nreloc 0 flags 0x80000408 reserved1 0 (index into indirect symbol table) reserved2 12 (size of stubs)
  Section sectname __rodata segname __TEXT addr 0x000000010023b960 size 0x000000000001c28c offset 2341216 align 2^5 (32) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __const segname __TEXT addr 0x0000000100257bf0 size 0x0000000000006bb0 offset 2456560 align 2^4 (16) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __cstring segname __TEXT addr 0x000000010025e7a0 size 0x0000000000008129 offset 2484128 align 2^0 (1) reloff 0 nreloc 0 flags 0x00000002 reserved1 0 reserved2 0
  Section sectname __unwind_info segname __TEXT addr 0x00000001002668cc size 0x0000000000001670 offset 2517196 align 2^2 (4) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __eh_frame segname __TEXT addr 0x0000000100267f40 size 0x00000000000000b8 offset 2522944 align 2^3 (8) reloff 0 nreloc 0 flags 0x6800000b reserved1 0 reserved2 0
Load command 2
  cmd LC_SEGMENT_64 cmdsize 552 segname __DATA_CONST vmaddr 0x0000000100268000 vmsize 0x00000000000e8000 fileoff 2523136 filesize 950272 maxprot 0x00000003 initprot 0x00000003 nsects 6 flags 0x10
  Section sectname __got segname __DATA_CONST addr 0x0000000100268000 size 0x00000000000003c8 offset 2523136 align 2^3 (8) reloff 0 nreloc 0 flags 0x00000006 reserved1 119 (index into indirect symbol table) reserved2 0
  Section sectname __const segname __DATA_CONST addr 0x00000001002683c8 size 0x0000000000002ce0 offset 2524104 align 2^3 (8) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __rodata segname __DATA_CONST addr 0x000000010026b0c0 size 0x0000000000036820 offset 2535616 align 2^5 (32) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __typelink segname __DATA_CONST addr 0x00000001002a18e0 size 0x0000000000000b88 offset 2758880 align 2^5 (32) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __itablink segname __DATA_CONST addr 0x00000001002a2480 size 0x00000000000001a8 offset 2761856 align 2^5 (32) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __gopclntab segname __DATA_CONST addr 0x00000001002a2640 size 0x00000000000abd88 offset 2762304 align 2^5 (32) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
Load command 3
  cmd LC_SEGMENT_64 cmdsize 552 segname __DATA vmaddr 0x0000000100350000 vmsize 0x0000000000040000 fileoff 3473408 filesize 98304 maxprot 0x00000003 initprot 0x00000003 nsects 6 flags 0x0
  Section sectname __data segname __DATA addr 0x0000000100350000 size 0x0000000000008e00 offset 3473408 align 2^5 (32) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __go_buildinfo segname __DATA addr 0x0000000100358e00 size 0x00000000000002c0 offset 3509760 align 2^4 (16) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __noptrdata segname __DATA addr 0x00000001003590c0 size 0x000000000000be80 offset 3510464 align 2^5 (32) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __bss segname __DATA addr 0x0000000100364f40 size 0x0000000000026ce8 offset 0 align 2^5 (32) reloff 0 nreloc 0 flags 0x00000001 reserved1 0 reserved2 0
  Section sectname __noptrbss segname __DATA addr 0x000000010038bc40 size 0x0000000000003760 offset 0 align 2^5 (32) reloff 0 nreloc 0 flags 0x00000001 reserved1 0 reserved2 0
  Section sectname __common segname __DATA addr 0x000000010038f3a0 size 0x0000000000000020 offset 0 align 2^3 (8) reloff 0 nreloc 0 flags 0x00000001 reserved1 0 reserved2 0
Load command 4
  cmd LC_SEGMENT_64 cmdsize 72 segname __LINKEDIT vmaddr 0x0000000100390000 vmsize 0x000000000005d542 fileoff 5865472 filesize 382274 maxprot 0x00000001 initprot 0x00000001 nsects 0 flags 0x0
Load command 5
  cmd LC_DYLD_CHAINED_FIXUPS cmdsize 16 dataoff 5865472 datasize 2352
Load command 6
  cmd LC_DYLD_EXPORTS_TRIE cmdsize 16 dataoff 5867824 datasize 7184
Load command 7
  cmd LC_SYMTAB cmdsize 24 symoff 5884664 nsyms 7451 stroff 6004840 strsize 194336
Load command 8
  cmd LC_DYSYMTAB cmdsize 80 ilocalsym 0 nlocalsym 6935 iextdefsym 6935 nextdefsym 383 iundefsym 7318 nundefsym 133 tocoff 0 ntoc 0 modtaboff 0 nmodtab 0 extrefsymoff 0 nextrefsyms 0 indirectsymoff 6003880 nindirectsyms 240 extreloff 0 nextrel 0 locreloff 0 nlocrel 0
Load command 9
  cmd LC_LOAD_DYLINKER cmdsize 32 name /usr/lib/dyld (offset 12)
Load command 10
  cmd LC_UUID cmdsize 24 uuid 7537C0A0-53D9-B259-D966-C77D22FB7B3B
Load command 11
  cmd LC_BUILD_VERSION cmdsize 32 platform 1 sdk 14.5 minos 14.0 ntools 1 tool 3 version 1053.12
Load command 12
  cmd LC_SOURCE_VERSION cmdsize 16 version 0.0
Load command 13
  cmd LC_MAIN cmdsize 24 entryoff 474800 stacksize 0
Load command 14
  cmd LC_LOAD_DYLIB cmdsize 104 name /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (offset 24) time stamp 2 Thu Jan 1 08:00:02 1970 current version 2503.1.0 compatibility version 150.0.0
Load command 15
  cmd LC_LOAD_DYLIB cmdsize 56 name /usr/lib/libresolv.9.dylib (offset 24) time stamp 2 Thu Jan 1 08:00:02 1970 current version 1.0.0 compatibility version 1.0.0
Load command 16
  cmd LC_LOAD_DYLIB cmdsize 56 name /usr/lib/libSystem.B.dylib (offset 24) time stamp 2 Thu Jan 1 08:00:02 1970 current version 1345.120.2 compatibility version 1.0.0
Load command 17
  cmd LC_FUNCTION_STARTS cmdsize 16 dataoff 5875008 datasize 9656
Load command 18
  cmd LC_DATA_IN_CODE cmdsize 16 dataoff 5884664 datasize 0
Load command 19
  cmd LC_CODE_SIGNATURE cmdsize 16 dataoff 6199184 datasize 48562
Load command 20
  cmd LC_SEGMENT_64 cmdsize 1032 segname __DWARF vmaddr 0x0000000000000000 vmsize 0x0000000000000000 fileoff 3571712 filesize 2283036 maxprot 0x00000007 initprot 0x00000000 nsects 12 flags 0x0
  Section sectname __zdebug_line segname __DWARF addr 0x00000001003dc000 size 0x0000000000066f7e offset 3571712 align 2^0 (1) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __zdebug_ranges segname __DWARF addr 0x0000000100442f7e size 0x000000000000fd8e offset 3993470 align 2^0 (1) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __zdebug_loc segname __DWARF addr 0x0000000100452d0c size 0x000000000007cb4a offset 4058380 align 2^0 (1) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __zdebug_aranges segname __DWARF addr 0x00000001004cf856 size 0x0000000000000508 offset 4569174 align 2^0 (1) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __zdebug_info segname __DWARF addr 0x00000001004cfd5e size 0x00000000000d39f2 offset 4570462 align 2^0 (1) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __zdebug_frame segname __DWARF addr 0x00000001005a3750 size 0x000000000000a7f6 offset 5437264 align 2^0 (1) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __zdebug_abbrev segname __DWARF addr 0x00000001005adf46 size 0x0000000000000531 offset 5480262 align 2^0 (1) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __zdebug_str segname __DWARF addr 0x00000001005ae477 size 0x000000000001a212 offset 5481591 align 2^0 (1) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __zapple_names segname __DWARF addr 0x00000001005c8689 size 0x0000000000033da1 offset 5588617 align 2^0 (1) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __apple_namespac segname __DWARF addr 0x00000001005fc42a size 0x0000000000000024 offset 5801002 align 2^0 (1) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __zapple_types segname __DWARF addr 0x00000001005fc44e size 0x000000000000d1aa offset 5801038 align 2^0 (1) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
  Section sectname __apple_objc segname __DWARF addr 0x00000001006095f8 size 0x0000000000000024 offset 5854712 align 2^0 (1) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
```

cherrymui commented 1 month ago

Mach-O doesn't use relocations in that sense. They have "rebase" and "bind" tables instead. Try objdump --macho --bind and objdump --macho --rebase with the macOS system objdump (from Xcode).

cherrymui commented 1 month ago

For a binary I built, I have

$ nm x | grep firstmoduledata
00000001000f7b00 s _runtime.firstmoduledata
$ nm x | grep runtime.pclntab
000000010009f820 s _runtime.pclntab
$ objdump -m --rebase ./x | grep 0x1000F7B00
...
__DATA   __noptrdata        0x1000F7B00  rebase ptr   0x10009F820

The rebase entry does point to the right address.

As you mentioned above, this is not a bug. And I don't think there is much we can do. Thanks.

Zxilly commented 1 month ago

Thanks for the suggestion. I tried llvm-objdump --macho --bind and llvm-objdump --macho --rebase on my samples, but the rebase and bind tables appear to be empty in these files while the address discrepancy is still there. Specifically, I am referring to this file:

https://github.com/Zxilly/go-testdata/releases/download/latest/bin-darwin-1.23-arm64-strip-pie-cgo

Could you suggest some other directions?

I got

PS T:\> llvm-objdump --macho --rebase .\bin-darwin-1.23-arm64-strip-pie-cgo
.\bin-darwin-1.23-arm64-strip-pie-cgo:

Rebase table:
segment  section            address     type
PS T:\> llvm-objdump --macho --bind .\bin-darwin-1.23-arm64-strip-pie-cgo
.\bin-darwin-1.23-arm64-strip-pie-cgo:

Bind table:
segment  section            address    type       addend dylib            symbol

on this file.

Zxilly commented 1 month ago

I guess it is related to chained fixups.

cherrymui commented 1 month ago

You're probably right that chained fixups are related. That is yet another way of expressing dynamic relocations in Mach-O.