golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

cmd/link: ppc64 (big endian) cgo errors #13192

Open laboger opened 9 years ago

laboger commented 9 years ago

I've built golang from master with the patches from issue 11184 to get external linking to work with ppc64le. That seems to work well on ppc64le with external linking and cgo.

In cmd/dist/build.go, linux/ppc64le is in the cgoEnabled map but linux/ppc64 is not. So the build of golang with the latest patches on ppc64 does not quite work with cgo.
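To illustrate why the missing entry matters, here is a minimal sketch of a platform table keyed by GOOS/GOARCH. The table is abbreviated and illustrative, not a copy of cmd/dist/build.go, and the helper `cgoSupported` is a hypothetical name. Only the presence of linux/ppc64le and the absence of linux/ppc64 are taken from the description above.

```go
package main

import "fmt"

// cgoEnabled is an abbreviated, illustrative table. The real map in
// cmd/dist/build.go lists many more platforms.
var cgoEnabled = map[string]bool{
	"linux/amd64":   true,
	"linux/ppc64le": true,
	// "linux/ppc64" is absent, so a lookup yields the zero value, false.
}

// cgoSupported (hypothetical helper) reports whether the table enables cgo
// for the given GOOS/GOARCH pair.
func cgoSupported(goos, goarch string) bool {
	return cgoEnabled[goos+"/"+goarch]
}

func main() {
	for _, arch := range []string{"ppc64le", "ppc64"} {
		fmt.Printf("linux/%s cgo enabled: %v\n", arch, cgoSupported("linux", arch))
	}
}
```

Because a missing map key reads as false, a platform is treated as not supporting cgo until it is added explicitly.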

If I build the golang toolchain on ppc64 with CGO_ENABLED=1, I first get this error:

cannot use dynamic imports with -d flag

I made a change to cmd/link/internal/ppc64/obj.go to get rid of this message, but then hit this error:

    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/net.a(_all.o): unknown relocation type 51; compiled without -fpic?
    ......
    runtime/cgo(.opd): unexpected relocation type 307
    .......

and then "too many errors".

I can see that relocation type 51 is R_PPC64_TOC and that is not handled by the code. I tried adding those defines but not sure what should be generated for this relocation type. If there are suggestions on what to do I can try them out.

laboger commented 9 years ago

I've been doing this on a RHEL7.2 system.

mwhudson commented 9 years ago

Those errors suggest that you're not actually linking externally. Try adding -ldflags=-linkmode=external to your go tool invocation.

I'm pretty sure that things still won't work though.

laboger commented 9 years ago

These errors happen during the build of golang on ppc64 if I build with CGO_ENABLED=1. They don't happen if CGO_ENABLED is not set.

Under

    ##### Building packages and commands for linux/ppc64.
    .....
    cmd/pprof
    net/rpc/jsonrpc
    cmd/trace
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/net.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/net.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/net.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/net.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/net.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/net.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/net.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/runtime/cgo.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/runtime/cgo.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/runtime/cgo.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/runtime/cgo.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/runtime/cgo.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/runtime/cgo.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/runtime/cgo.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/runtime/cgo.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/runtime/cgo.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/runtime/cgo.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/runtime/cgo.a(_all.o): unknown relocation type 51; compiled without -fpic?
    /home/boger/golang/gitsrc/latest/go/pkg/linux_ppc64/runtime/cgo.a(_all.o): unknown relocation type 51; compiled without -fpic?
    runtime/cgo(.opd): unexpected relocation type 307
    runtime/cgo(.opd): unexpected relocation type 307
    /home/boger/golang/gitsrc/latest/go/pkg/tool/linux_ppc64/link: too many errors

mwhudson commented 9 years ago

@laboger so you're saying that the patches from issue 11184 break building go on ppc64? that's bad, I'll make sure to fix that. I'm not promising to make external linking work on ppc64 though (which is what I thought this was about).

laboger commented 9 years ago

I'm sorry that I wasn't clear. I'm not saying your patches for 11184 break the build of ppc64. The above errors occur if I build on ppc64 and set CGO_ENABLED=1. The errors are the same or similar if I try to do the same build without your patches on ppc64.

You are correct, the point of this issue is to document the errors that I've seen with external linking on ppc64. I used your patches in my testing just because they are the latest changes I'm aware of for linking on Power. I understand you are not promising to make it work for ppc64, but I'm trying to understand how much is missing from external linking for ppc64.

laboger commented 9 years ago

This issue should not affect the merging of your patches from 11184. They work great and I know of no problem with them. They are for ppc64le, this issue is ppc64.

ianlancetaylor commented 9 years ago

Just to make sure we're all on the same page, these errors are from internal linking, which is what the toolchain does when building the tools. Presumably something has changed causing the C compiler to generate R_PPC64_TOC relocs. The code in cmd/link/internal/ppc64/asm.go needs to handle R_POWER_TOC, by resolving it to the value of symtoc. The code in cmd/link/internal/ld/ldelf.go needs to recognize R_PPC64_TOC, setting the size to 8.

mwhudson commented 9 years ago

FWIW, I'm not sure that cgo has ever worked on ppc64 BE (ping @aclements @minux), so it might not be C compiler changes to blame.

minux commented 9 years ago

I don't think cgo works on BE (I have an old patch set to enable external linking, but it needs major updating after Michael's changes.)

Austin's internal linking work is for ppc64le.

What's the version of gcc?

laboger commented 9 years ago

I have some old patches that Minux gave me back in May to make external linking work. There are only a few ppc64 BE specific changes that aren't in Michael's. I will try adding them and see if that helps. Perhaps something in his patches prevents the generation of R_PPC64_TOC.

My RHEL 7.2 ppc64 BE machine defaults to gcc 4.8.5.

laboger commented 9 years ago

Here are the CLs that I am aware of related to ppc64 BE linking which match the patches I had previously used from Minux:

https://go-review.googlesource.com/#/c/9677 change to obj.go no longer needed due to MHD's change

https://go-review.googlesource.com/#/c/9676 change to tls_ppc64x.s no longer needed due to MHD's change in TLS handling. I applied the other changes from this CL.

https://go-review.googlesource.com/#/c/9673 change to obj.go no longer needed

The file asm.go from 9677 and 9673 is a question because I see there are conflicting changes now with what's upstream.

I tried to add support to handle R_PPC64_TOC based on Ian's comments above but was unsure what to put in adddynrel for the R_PPC64_TOC case.

And then after making those changes, I see errors like this:

TOC-relative relocation in object without .TOC.

I don't see where a .TOC. object would get generated so it would be found when calling Linkrlookup.

I'm not sure what the next step is to make this work. I agree with previous comments that when CGO_ENABLED=1 on ppc64 BE this has probably never worked. I can see that gcc generates .opd sections with the R_PPC64_TOC relocation type and that happens even with older versions of gcc. I found this .opd section in some of the _all.o files that are generated during the golang build that are generating the error message about the missing relocation type.

rsc commented 8 years ago

We're not going to get to this for Go 1.6.

mwhudson commented 8 years ago

This bug is assigned to me but I'm not in a position to work on it. Can someone unassign me? (I don't think I have permissions to do that!)

alexbrainman commented 8 years ago

Unassigned.

laboger commented 8 years ago

Can I get information on what is left to be done to get this to work?

rsc commented 8 years ago

@laboger, my understanding is that the patches Minux posted a while back do work (I think you've said that), but they no longer apply cleanly, and Minux has not had time to update them to the latest Go tree. (Minux is a volunteer; his day job is being a grad student.)

If you or anyone else would like to take over those patches, get them to apply to the tree, and send them in for review, we can get them in for Go 1.6.

More generally, my understanding was that ppc64 (big-endian) was not as important as ppc64le, and that ppc64le support is committed and working. If that's wrong please let me know. Thanks.

mwhudson commented 8 years ago

I don't know if ppc64 is less important than ppc64le in general, but certainly to me (and Canonical) it is, as there is no Ubuntu port to ppc64 (and I don't have access to ppc64 hw, so even if I wanted to work on it, in practice I can't).

I don't think there is an enormous amount to be done.

laboger commented 8 years ago

Yes your understanding is true -- ppc64 for big endian is not as important at this time. It was very important to get ppc64le working in Go 1.6 with external linking and that seems to be working well and that is a very good thing.

This question is mainly for my understanding, in case the importance of BE changes and I'm asked what it would take to make it work. I wasn't sure if it was thought that Minux' previous patches contained most of the needed function or if there were some known pieces missing. I might have thought at one time they worked OK on BE but now I'm not sure on that because I probably didn't understand the variations that needed testing.

aleek commented 8 years ago

I believe that I've found a way to cross-compile binaries for ppc64be. I'm looking forward to some feedback. http://dutkowski.me/index.php/2016/02/17/crosscompiling-go-with-c-library-on-powerpc64/

regards Alex

tdolby commented 8 years ago

@laboger I've got to the same point as you have (followed Ian's instructions and now the build can't find .TOC. entries) and it looks like the ELF v2 ABI isn't implemented on ppc64be in any of the gcc versions I've tried (4.4, 4.8, 5), which is why we don't see .TOC. entries being created.

This can be verified by running strings on a generated .o file created using gcc on big- and little-endian and comparing the output. I've tried specifying -mabi=elfv2 on the gcc command line to force it on big-endian, but it's unable to find gnu/stubs-64-v2.h and the compile fails. I've also tried to hack the cgo code to use .toc (lower case), but that hasn't worked either.

Looks like a rebuild of gcc with ABI v2 might be needed to get things going, or cross-compiling as @aleek has done.

laboger commented 8 years ago

ppc64 BE uses ABI v1 and ppc64 LE uses ABI v2. You can't build gcc on a ppc64 BE machine with ABI v2, because everything else on a ppc64 BE machine expects ABI v1 (dynamic linker, assembler, shared libraries, etc.) The golang for ppc64 BE must generate the linking environment that is consistent with ABI v1 for it to link correctly and run there.

I did some further investigation on this back when I opened the issue which I didn't add to this issue. There is a lot missing in golang for GOARCH=ppc64 with respect to external linking on ABI v1. The call stubs are not correct, the PLT is not set up correctly, plus probably some other things, in addition to the defining of the .TOC. symbol. @tdolby

houstar commented 8 years ago

@laboger How much work would it take to enable cgo in golang on ppc64 BE? I'm curious because we're interested in running docker on a Power7 machine that we purchased earlier. Thanks.

markos commented 7 years ago

@laboger @minux @aclements hi and sorry for tagging all you guys, I've started implementing the ppc64 (BE) bits in the internal linker and am hitting some roadblocks. It's harder than I expected initially (ABIv1 is more complicated than I thought), but I am prepared to do the work; I just need some more specific directions (yes, I know that the TOC entries are wrong etc). I think I saw minux in an older forum claiming to have ppc64be working, but the links to the patches were broken (they pointed to the older code.google site).

What I've done so far: enabled cgo support in the code, added ppc64 wherever it was needed for shared builds/linking, and added the R_PPC64_TOC case in adddynrel() in cmd/link/internal/ppc64/asm.go, but symtoc (obviously) fails to find the TOC entry (r.Sym is nil). Then I added the .TOC base entry in src/cmd/link/internal/ld/elf.go (using DT_PPC64_TOC = 0x8000). So the TOC base exists, and I need to add the TOC+symbol offset to the symbol table for each symbol. However, when I do that (in genplt(), allowing r.Type == 256+R_PPC64_TOC), I'm getting "unexpected R_PPC64_REL24 for dyn import" in adddynrel(). I guess I'm adding the symbol in the wrong place.

I'd appreciate any help.

jrtc27 commented 7 years ago

@markos what symbol are you referencing? Is it really local or external? Calls to external functions are also done with a standard bl, and it's up to the linker to insert a trampoline to load the actual callee's address and TOC from the function descriptor, the address of which is loaded from the caller's TOC. My guess is that's not being done properly; see https://uclibc.org/docs/psABI-ppc64.pdf, section 3.5.11, for the details.

laboger commented 7 years ago

Have you considered using gccgo? That can be built for ppc64 big endian, configured for several instruction sets, and the linking works. When it is built, it has its own go tool, so once that is in your path it should work for the most part like golang. In some cases, you might need to make some changes to your build scripts (e.g. if you want special options) but that is much less painful than the changes that would be needed in golang to get linking to work on ppc64 big endian.

zeldin commented 6 years ago

@markos Are you still working on this? I could try helping out... Do you have a branch on github with your work?

zte-majiang commented 6 years ago

Hi all, we have implemented ppc64be cgo on golang 1.10.3. With it (and a custom GNU toolchain), we have successfully built docker_17.03.2-ce and done some basic tests (on our ppc64 e6500 cores). Everything looks fine. I'll try to upstream these patches later (after we get permission from our boss ...). @laboger @minux @aclements @markos @zeldin

laboger commented 6 years ago

Just curious, why do you need a custom GNU toolchain?

majiang31312 commented 6 years ago

Just curious, why do you need a custom GNU toolchain?

The GNU toolchain used in the ppc64 cgo tests is maintained internally (also by our team), which is why I call it "custom". In theory, a standard version should also be OK, but we have not tested that case yet.

gopherbot commented 6 years ago

Change https://golang.org/cl/146898 mentions this issue: cgo: add initial support for ppc64

zeldin commented 5 years ago

@zte-majiang congrats on your patch landing in go 1.12. There still seems to be some work to do though:

  1. cmd/link/internal/ld/config.go still has checks which forbid use of external linking on ppc64. By removing these, I was able to build and run a cgo "hello world".
  2. Docker still does not build (tried 18.03.1). After fixing some trivial things I ran into this crash:
    
    Building dynamically linked build/docker-linux-ppc64
    fatal error: unexpected signal during runtime execution
    [signal SIGSEGV: segmentation violation code=0x2 addr=0x7fffd4f83808 pc=0x7fffd4f83808]

    runtime stack:
    runtime: unexpected return pc for runtime.sigpanic called from 0x7fffd4f83808
    stack: frame={sp:0x7fffd4f83760, fp:0x7fffd4f837a0} stack=[0x7fffd4784900,0x7fffd4f83970)
    00007fffd4f83660:  000000001005a3c4 <runtime.systemstack+212>  000000001002f240 <runtime.printunlock+96>
    00007fffd4f83670:  0000000000000001  000000001127049b
    00007fffd4f83680:  0000000012339140  000000001002e75c <runtime.throw+92>
    00007fffd4f83690:  00007fffd4f83720  0000000010058c08 <runtime.throw.func1+104>
    00007fffd4f836a0:  00007fffd4f83720  000000001002e75c <runtime.throw+92>
    00007fffd4f836b0:  0000000012339140  000000001002e908 <runtime.fatalthrow+72>
    00007fffd4f836c0:  000000001005a3c4 <runtime.systemstack+212>  001a0c47312c9cc9
    00007fffd4f836d0:  0000000012339140  000000001002e75c <runtime.throw+92>
    00007fffd4f836e0:  00000000112ade36  000000000000002a
    00007fffd4f836f0:  000000000000002a  00007fffd4f83700
    00007fffd4f83700:  0000000010058c20 <runtime.fatalthrow.func1+0>  0000000012339140
    00007fffd4f83710:  000000001002e75c <runtime.throw+92>  00007fffd4f83720
    00007fffd4f83720:  000000001004511c <runtime.sigpanic+1228>  010000001002685c
    00007fffd4f83730:  000000c0004b0180  000000001003167c <runtime.mcommoninit+204>
    00007fffd4f83740:  00007fffd4f83748  0000000010058ba0 <runtime.throw.func1+0>
    00007fffd4f83750:  00000000112ade36  000000000000002a
    00007fffd4f83760: <00007fffd4f83808  0000000010034630 <runtime.newm1+192>
    00007fffd4f83770:  0000000000000000  000000c0004b0300
    00007fffd4f83780:  00000000112ade36  000000000000002a
    00007fffd4f83790:  00000000000001d0  0000000012339140
    00007fffd4f837a0: >00007fffd4f83808  000000c0004b0300
    00007fffd4f837b0:  000000c0004bc000  000000c000076f00
    00007fffd4f837c0:  00000000100344bc <runtime.newm+140>  0000000012339b20
    00007fffd4f837d0:  0000000010034464 <runtime.newm+52>  0000000000000016
    00007fffd4f837e0:  000000001229fc90  00007fffd4f83808
    00007fffd4f837f0:  0000000000076f00  0000000010fdf560
    00007fffd4f83800:  0000000011326fa0  000000c0004b0300
    00007fffd4f83810:  000000c0004b2088  00000000100330a0 <runtime.mstart+0>
    00007fffd4f83820:  0000000010034a48 <runtime.startm+296>  0000000000000008
    00007fffd4f83830:  00000000100349a4 <runtime.startm+132>  0000000010059600 <runtime.newproc.func1+80>
    00007fffd4f83840:  000000c0004b2000  0000000000000000
    00007fffd4f83850:  000000c0004b2000  000000c0004b2000
    00007fffd4f83860:  0000000010034bec <runtime.handoffp+92>  0000000010037b14 <runtime.entersyscallblock_handoff+52>
    00007fffd4f83870:  00000000100581dc <runtime.(*mheap).alloc.func1+76>  0000000010038098 <runtime.exitsyscallfast_pidle+184>
    00007fffd4f83880:  0000000000000000  000000c000076f00
    00007fffd4f83890:  000000c000076f00  0000000000000000
    runtime.throw(0x112ade36, 0x2a)
        /usr/lib/go/src/runtime/panic.go:617 +0x5c
    runtime: unexpected return pc for runtime.sigpanic called from 0x7fffd4f83808
    stack: frame={sp:0x7fffd4f83760, fp:0x7fffd4f837a0} stack=[0x7fffd4784900,0x7fffd4f83970)
    00007fffd4f83660:  000000001005a3c4 <runtime.systemstack+212>  000000001002f240 <runtime.printunlock+96>
    00007fffd4f83670:  0000000000000001  000000001127049b
    00007fffd4f83680:  0000000012339140  000000001002e75c <runtime.throw+92>
    00007fffd4f83690:  00007fffd4f83720  0000000010058c08 <runtime.throw.func1+104>
    00007fffd4f836a0:  00007fffd4f83720  000000001002e75c <runtime.throw+92>
    00007fffd4f836b0:  0000000012339140  000000001002e908 <runtime.fatalthrow+72>
    00007fffd4f836c0:  000000001005a3c4 <runtime.systemstack+212>  001a0c47312c9cc9
    00007fffd4f836d0:  0000000012339140  000000001002e75c <runtime.throw+92>
    00007fffd4f836e0:  00000000112ade36  000000000000002a
    00007fffd4f836f0:  000000000000002a  00007fffd4f83700
    00007fffd4f83700:  0000000010058c20 <runtime.fatalthrow.func1+0>  0000000012339140
    00007fffd4f83710:  000000001002e75c <runtime.throw+92>  00007fffd4f83720
    00007fffd4f83720:  000000001004511c <runtime.sigpanic+1228>  010000001002685c
    00007fffd4f83730:  000000c0004b0180  000000001003167c <runtime.mcommoninit+204>
    00007fffd4f83740:  00007fffd4f83748  0000000010058ba0 <runtime.throw.func1+0>
    00007fffd4f83750:  00000000112ade36  000000000000002a
    00007fffd4f83760: <00007fffd4f83808  0000000010034630 <runtime.newm1+192>
    00007fffd4f83770:  0000000000000000  000000c0004b0300
    00007fffd4f83780:  00000000112ade36  000000000000002a
    00007fffd4f83790:  00000000000001d0  0000000012339140
    00007fffd4f837a0: >00007fffd4f83808  000000c0004b0300
    00007fffd4f837b0:  000000c0004bc000  000000c000076f00
    00007fffd4f837c0:  00000000100344bc <runtime.newm+140>  0000000012339b20
    00007fffd4f837d0:  0000000010034464 <runtime.newm+52>  0000000000000016
    00007fffd4f837e0:  000000001229fc90  00007fffd4f83808
    00007fffd4f837f0:  0000000000076f00  0000000010fdf560
    00007fffd4f83800:  0000000011326fa0  000000c0004b0300
    00007fffd4f83810:  000000c0004b2088  00000000100330a0 <runtime.mstart+0>
    00007fffd4f83820:  0000000010034a48 <runtime.startm+296>  0000000000000008
    00007fffd4f83830:  00000000100349a4 <runtime.startm+132>  0000000010059600 <runtime.newproc.func1+80>
    00007fffd4f83840:  000000c0004b2000  0000000000000000
    00007fffd4f83850:  000000c0004b2000  000000c0004b2000
    00007fffd4f83860:  0000000010034bec <runtime.handoffp+92>  0000000010037b14 <runtime.entersyscallblock_handoff+52>
    00007fffd4f83870:  00000000100581dc <runtime.(*mheap).alloc.func1+76>  0000000010038098 <runtime.exitsyscallfast_pidle+184>
    00007fffd4f83880:  0000000000000000  000000c000076f00
    00007fffd4f83890:  000000c000076f00  0000000000000000
    runtime.sigpanic()
        /usr/lib/go/src/runtime/signal_unix.go:374 +0x4cc

    goroutine 1 [syscall, locked to thread]:
    runtime.notetsleepg(0x123393d8, 0xffffffffffffffff, 0xc00007b900)
        /usr/lib/go/src/runtime/lock_futex.go:227 +0x38 fp=0xc000260358 sp=0xc000260318 pc=0x1000ba28
    runtime.gcBgMarkStartWorkers()
        /usr/lib/go/src/runtime/mgc.go:1785 +0xb4 fp=0xc0002603a8 sp=0xc000260358 pc=0x1001d5d4
    runtime.gcStart(0x1, 0x0, 0x496000)
        /usr/lib/go/src/runtime/mgc.go:1264 +0x230 fp=0xc000260430 sp=0xc0002603a8 pc=0x1001c250
    runtime.mallocgc(0x10, 0x0, 0x0, 0xc0003f5fe8)
        /usr/lib/go/src/runtime/malloc.go:1032 +0x3dc fp=0xc0002604f0 sp=0xc000260430 pc=0x1000cb4c
    runtime.growslice(0x11010140, 0xc0003f5fe8, 0x1, 0x1, 0x2, 0xc0003f5fe8, 0x0, 0x1)
        /usr/lib/go/src/runtime/slice.go:175 +0x134 fp=0xc000260560 sp=0xc0002604f0 pc=0x10046614
    encoding/json.(*scanner).pushParseState(...)
        /usr/lib/go/src/encoding/json/scanner.go:144
    encoding/json.stateBeginValue(0xc00048f068, 0x7b000000111e5660, 0x9)
        /usr/lib/go/src/encoding/json/scanner.go:183 +0x5a0 fp=0xc0002605f8 sp=0xc000260560 pc=0x1032a4e0
    encoding/json.checkValid(0xc000456686, 0x76, 0xf77a, 0xc00048f068, 0x110687e0, 0x10000c000491218)
        /usr/lib/go/src/encoding/json/scanner.go:29 +0xb8 fp=0xc000260638 sp=0xc0002605f8 pc=0x10329c18
    encoding/json.Unmarshal(0xc000456686, 0x76, 0xf77a, 0x10fc61c0, 0xc000010800, 0x110abb40, 0x111be620)
        /usr/lib/go/src/encoding/json/decode.go:100 +0x5c fp=0xc000260690 sp=0xc000260638 pc=0x103195ec
    github.com/docker/cli/vendor/github.com/go-openapi/spec.(*Ref).UnmarshalJSON(0xc000491218, 0xc000456686, 0x76, 0xf77a, 0x7fffa939c1c8, 0xc000491218)
        /tmp/portage/app-emulation/docker-18.03.1/work/docker-18.03.1/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/ref.go:148 +0x68 fp=0xc000260768 sp=0xc000260690 pc=0x1089da88
    encoding/json.(*decodeState).object(0xc00048efa0, 0x111be620, 0xc000491218, 0x16, 0xc00048efc8, 0x7b00000010329c18)
        /usr/lib/go/src/encoding/json/decode.go:611 +0x1e3c fp=0xc000260a48 sp=0xc000260768 pc=0x1031d80c
    encoding/json.(*decodeState).value(0xc00048efa0, 0x111be620, 0xc000491218, 0x16, 0x0, 0xc00048efc8)
        /usr/lib/go/src/encoding/json/decode.go:381 +0x64 fp=0xc000260ac0 sp=0xc000260a48 pc=0x1031a584
    encoding/json.(*decodeState).unmarshal(0xc00048efa0, 0x111be620, 0xc000491218, 0xc00048efc8, 0x0)
        /usr/lib/go/src/encoding/json/decode.go:179 +0x1f4 fp=0xc000260b60 sp=0xc000260ac0 pc=0x10319c34
    encoding/json.Unmarshal(0xc000456686, 0x76, 0xf77a, 0x111be620, 0xc000491218, 0x0, 0x0)
        /usr/lib/go/src/encoding/json/decode.go:106 +0xf8 fp=0xc000260bb8 sp=0xc000260b60 pc=0x10319688
    github.com/docker/cli/vendor/github.com/go-openapi/spec.(*Schema).UnmarshalJSON(0xc000467680, 0xc000456686, 0x76, 0xf77a, 0x7fffa939c168, 0xc000467680)
        /tmp/portage/app-emulation/docker-18.03.1/work/docker-18.03.1/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/schema.go:589 +0xb0 fp=0xc000260cb8 sp=0xc000260bb8 pc=0x108a0370
    encoding/json.(*decodeState).object(0xc0003f68c0, 0x111a1180, 0xc000467680, 0x199, 0xc0003f68e8, 0x7b0000000000000b)
        /usr/lib/go/src/encoding/json/decode.go:611 +0x1e3c fp=0xc000260f98 sp=0xc000260cb8 pc=0x1031d80c
    encoding/json.(*decodeState).value(0xc0003f68c0, 0x111a1180, 0xc000467680, 0x199, 0xc000490fc0, 0x99)
        /usr/lib/go/src/encoding/json/decode.go:381 +0x64 fp=0xc000261010 sp=0xc000260f98 pc=0x1031a584
    encoding/json.(*decodeState).object(0xc0003f68c0, 0x11067ac0, 0xc0004663e0, 0x195, 0xc0003f68e8, 0x7b00000000000000)
        /usr/lib/go/src/encoding/json/decode.go:763 +0x1320 fp=0xc0002612f0 sp=0xc000261010 pc=0x1031ccf0
    encoding/json.(*decodeState).value(0xc0003f68c0, 0x11067ac0, 0xc0004663e0, 0x195, 0x11067ac0, 0xc0004663e0)
        /usr/lib/go/src/encoding/json/decode.go:381 +0x64 fp=0xc000261368 sp=0xc0002612f0 pc=0x1031a584
    encoding/json.(*decodeState).object(0xc0003f68c0, 0x10fb2240, 0xc000466248, 0x16, 0xc0003f68e8, 0x7b00000010329c18)
        /usr/lib/go/src/encoding/json/decode.go:763 +0x1320 fp=0xc000261648 sp=0xc000261368 pc=0x1031ccf0
    encoding/json.(*decodeState).value(0xc0003f68c0, 0x10fb2240, 0xc000466248, 0x16, 0x7fffa939c168, 0xc0003f68e8)
        /usr/lib/go/src/encoding/json/decode.go:381 +0x64 fp=0xc0002616c0 sp=0xc000261648 pc=0x1031a584
    encoding/json.(*decodeState).unmarshal(0xc0003f68c0, 0x10fb2240, 0xc000466248, 0xc0003f68e8, 0x0)
        /usr/lib/go/src/encoding/json/decode.go:179 +0x1f4 fp=0xc000261760 sp=0xc0002616c0 pc=0x10319c34
    encoding/json.Unmarshal(0xc000456000, 0x9c54, 0xfe00, 0x10fb2240, 0xc000466248, 0x7fffa939c168, 0xc0003f6848)
        /usr/lib/go/src/encoding/json/decode.go:106 +0xf8 fp=0xc0002617b8 sp=0xc000261760 pc=0x10319688
    github.com/docker/cli/vendor/github.com/go-openapi/spec.(*Schema).UnmarshalJSON(0xc000466000, 0xc000456000, 0x9c54, 0xfe00, 0x7fffa939c168, 0xc000466000)
        /tmp/portage/app-emulation/docker-18.03.1/work/docker-18.03.1/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/schema.go:586 +0x6c fp=0xc0002618b8 sp=0xc0002617b8 pc=0x108a032c
    encoding/json.(*decodeState).object(0xc0003f6820, 0x1124aaa0, 0xc000466000, 0x16, 0xc0003f6848, 0x7b00000010329c18)
        /usr/lib/go/src/encoding/json/decode.go:611 +0x1e3c fp=0xc000261b98 sp=0xc0002618b8 pc=0x1031d80c
    encoding/json.(*decodeState).value(0xc0003f6820, 0x1124aaa0, 0xc000466000, 0x16, 0x1034d0ec, 0xc0003f6848)
        /usr/lib/go/src/encoding/json/decode.go:381 +0x64 fp=0xc000261c10 sp=0xc000261b98 pc=0x1031a584
    encoding/json.(*decodeState).unmarshal(0xc0003f6820, 0x1124aaa0, 0xc000466000, 0xc0003f6848, 0x0)
        /usr/lib/go/src/encoding/json/decode.go:179 +0x1f4 fp=0xc000261cb0 sp=0xc000261c10 pc=0x10319c34
    encoding/json.Unmarshal(0xc000456000, 0x9c54, 0xfe00, 0x1124aaa0, 0xc000466000, 0x0, 0x0)
        /usr/lib/go/src/encoding/json/decode.go:106 +0xf8 fp=0xc000261d08 sp=0xc000261cb0 pc=0x10319688
    github.com/docker/cli/vendor/github.com/go-openapi/spec.Swagger20Schema(0xe6d3deeee71785a1, 0xc00038fba0, 0xe60000c000401590)
        /tmp/portage/app-emulation/docker-18.03.1/work/docker-18.03.1/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/spec.go:75 +0xa4 fp=0xc000261d80 sp=0xc000261d08 pc=0x108a0b34
    github.com/docker/cli/vendor/github.com/go-openapi/spec.MustLoadSwagger20Schema(...)
        /tmp/portage/app-emulation/docker-18.03.1/work/docker-18.03.1/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/spec.go:59
    github.com/docker/cli/vendor/github.com/go-openapi/spec.initResolutionCache(0x11065a80, 0xc000401500)
        /tmp/portage/app-emulation/docker-18.03.1/work/docker-18.03.1/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/expander.go:44 +0x28 fp=0xc000261de8 sp=0xc000261d80 pc=0x1089c748
    github.com/docker/cli/vendor/github.com/go-openapi/spec.init.ializers()
        /tmp/portage/app-emulation/docker-18.03.1/work/docker-18.03.1/src/github.com/docker/cli/vendor/github.com/go-openapi/spec/expander.go:40 +0x2d4 fp=0xc000261e68 sp=0xc000261de8 pc=0x108a2134
    github.com/docker/cli/vendor/github.com/go-openapi/spec.init()
        <autogenerated>:1 +0xa8 fp=0xc000261e88 sp=0xc000261e68 pc=0x108a2448
    github.com/docker/cli/vendor/k8s.io/apimachinery/pkg/api/resource.init()
        <autogenerated>:1 +0x84 fp=0xc000261ea8 sp=0xc000261e88 pc=0x108dc1e4
    github.com/docker/cli/vendor/k8s.io/apimachinery/pkg/apis/meta/v1.init()
        <autogenerated>:1 +0x70 fp=0xc000261ec8 sp=0xc000261ea8 pc=0x10962fb0
    github.com/docker/cli/kubernetes/compose/v1beta1.init()
        <autogenerated>:1 +0x60 fp=0xc000261ee8 sp=0xc000261ec8 pc=0x10971940
    github.com/docker/cli/cli/command/stack/kubernetes.init()
        <autogenerated>:1 +0x64 fp=0xc000261f08 sp=0xc000261ee8 pc=0x10ec40c4
    github.com/docker/cli/cli/command/stack.init()
        <autogenerated>:1 +0x6c fp=0xc000261f28 sp=0xc000261f08 pc=0x10ed817c
    github.com/docker/cli/cli/command/commands.init()
        <autogenerated>:1 +0x94 fp=0xc000261f48 sp=0xc000261f28 pc=0x10efdf74
    main.init()
        <autogenerated>:1 +0x78 fp=0xc000261f68 sp=0xc000261f48 pc=0x10f0f948
    runtime.main()
        /usr/lib/go/src/runtime/proc.go:188 +0x1fc fp=0xc000261fc0 sp=0xc000261f68 pc=0x100300bc
    runtime.goexit()
        /usr/lib/go/src/runtime/asm_ppc64x.s:857 +0x4 fp=0xc000261fc0 sp=0xc000261fc0 pc=0x1005c9d4

    goroutine 5 [syscall]:
    os/signal.signal_recv(0x0)
        /usr/lib/go/src/runtime/sigqueue.go:139 +0xf8
    os/signal.loop()
        /usr/lib/go/src/os/signal/signal_unix.go:23 +0x24
    created by os/signal.init.0
        /usr/lib/go/src/os/signal/signal_unix.go:29 +0x3c

Have you tried any Docker versions later than 17.03.2? Do they build for you? Thanks.

zte-majiang commented 5 years ago

> Have you tried any Docker versions later than 17.03.2? Do they build for you? Thanks.

Hi @zeldin, we test and use "go1.10.3 + docker 17.03" internally, and nothing else for the moment. Could you please describe the problem in more detail? Is your golang the 1.12 release, or built from the master branch? How can we reproduce the problem: by building docker 18.03.1, or by running it?

zeldin commented 5 years ago

Yes, I tried the golang 1.12 and 1.12.1 releases, via Gentoo ebuilds (and applying a small patch to enable external linking). I also patched Docker to remove the -buildmode=pie flag, since golang also rejects that flag on ppc64 and I figured it wasn't needed. The error occurred during the build; I never got so far as trying to run anything (although evidently the build runs built code internally). I used the Gentoo ebuild for Docker as well; I can try a manual build from an upstream source package later. I'm not sure if the problem is even related to cgo; it's the garbage collector which crashes, but it could still be related to cgo.

The error happens with both 18.03.1 and 18.09.3. There is no current ebuild for 17.03, but I can try a manual build.

zeldin commented 5 years ago

Some more information: The crash happens when running gen-manpages, but only if using the ebuild. It does not happen if I run the same build commands manually. Very weird. At any rate, it does not seem related to cgo then. :confused:

zeldin commented 5 years ago

After further investigation, I found that the crash is triggered by the following line in the ebuild:

    export CGO_CFLAGS="-I${ROOT}/usr/include"

It's not adding the include path that's the issue, but rather the overriding of the default CGO_CFLAGS, which is -g -O2. Specifically, removing -O2 from CGO_CFLAGS triggers the crash. Changing the assignment to either CGO_CFLAGS="-I${ROOT}/usr/include -g -O2" or CGO_CFLAGS="-I${ROOT}/usr/include -O2" makes gen-manpages run.
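To illustrate why setting CGO_CFLAGS silently drops -O2, here is a minimal sketch of the fallback behaviour described above. It assumes the go command substitutes the default "-g -O2" only when the variable is empty; `effectiveCgoCflags` and `hasFlag` are hypothetical helper names, not functions from cmd/go.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultCgoCflags is the default mentioned in this thread.
const defaultCgoCflags = "-g -O2"

// effectiveCgoCflags is a hypothetical helper modelling the fallback:
// the default is used only when the environment value is empty, so any
// non-empty value replaces the default instead of extending it.
func effectiveCgoCflags(env string) []string {
	if env == "" {
		env = defaultCgoCflags
	}
	return strings.Fields(env)
}

// hasFlag reports whether flags contains want.
func hasFlag(flags []string, want string) bool {
	for _, f := range flags {
		if f == want {
			return true
		}
	}
	return false
}

func main() {
	for _, env := range []string{"", "-I/usr/include", "-I/usr/include -g -O2"} {
		flags := effectiveCgoCflags(env)
		fmt.Printf("CGO_CFLAGS=%q -> %v (has -O2: %v)\n", env, flags, hasFlag(flags, "-O2"))
	}
}
```

Under that assumption, an ebuild that only wants to add an include path has to repeat the default flags itself, which matches the two working assignments above.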

So I reverse my previous statement -- the crash does seem cgo related, but specific to unoptimized cgo.

zte-majiang commented 5 years ago

@zeldin, I cannot reproduce this bug with a simple cgo program. Could you send me your "gen-manpages" binary (and any related shared libs)?

zeldin commented 5 years ago

@zte-majiang I also can't reproduce with a trivial program. Maybe it needs to be complex enough that the garbage collector needs to run? I dunno... I've put the binary up here: https://mc.pp.se/gen-manpages.tar.xz There are no special shared libs needed, only glibc and pthread. No input or command line switches are needed for the program to crash. Thanks for looking into it!

zte-majiang commented 5 years ago

> @zte-majiang I also can't reproduce with a trivial program. Maybe it needs to be complex enough that the garbage collector needs to run? I dunno... I've put the binary up here: https://mc.pp.se/gen-manpages.tar.xz There are no special shared libs needed, only glibc and pthread. No input or command line switches are needed for the program to crash. Thanks for looking into it!

@zeldin I believe that I have reproduced the bug using a simple hello program. It is much more complex than I thought. The main blocker is that golang assumes a minimum stack frame size of 32 bytes while gcc expects 48 (yes, it is another ABI conflict). Quite a lot of code needs to be adjusted... Moreover, I have no ppc64 machine at the moment (and debugging under qemu is extremely frustrating), so it may take some time to produce the first fix. I hope I can finish it this week...
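For readers wondering where the 32 and 48 come from, the following sketch lays out the fixed stack frame header as I understand the two 64-bit PowerPC ELF ABIs: six doublewords in ELFv1 (used by big-endian linux/ppc64) and four in ELFv2 (used by ppc64le). Mapping the comment's 32-byte figure to the ELFv2-style layout is my assumption, and the slot names and helper functions are illustrative only.

```go
package main

import "fmt"

const doubleword = 8 // bytes per header slot

// Fixed header slots at the bottom of a stack frame, lowest address first.
// Slot names are descriptive labels, not identifiers from any toolchain.
var (
	elfv1Header = []string{"back chain", "CR save", "LR save", "reserved (compiler)", "reserved (linker)", "TOC save"}
	elfv2Header = []string{"back chain", "CR save", "LR save", "TOC save"}
)

// headerSize returns the size in bytes of a frame header.
func headerSize(slots []string) int { return len(slots) * doubleword }

// offsetOf returns the byte offset of a named slot from the stack pointer,
// or -1 if the slot does not exist in that layout.
func offsetOf(slots []string, name string) int {
	for i, s := range slots {
		if s == name {
			return i * doubleword
		}
	}
	return -1
}

func main() {
	fmt.Println("ELFv1 header bytes:", headerSize(elfv1Header), "TOC save at", offsetOf(elfv1Header, "TOC save"))
	fmt.Println("ELFv2 header bytes:", headerSize(elfv2Header), "TOC save at", offsetOf(elfv2Header, "TOC save"))
}
```

If a caller reserves only the smaller header, a C callee that stores into the last two slots of the larger one writes past the reserved area and into the caller's own frame, which would be consistent with the memory corruption seen here.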

zte-majiang commented 5 years ago

@zeldin, I have made a quick-and-dirty fix against trunk; see the attached file. Could you try it in your environment? On my machine, it fixed all segfaults in my hello world. ppc64_cgofix.zip

zeldin commented 5 years ago

@zte-majiang Awesome! :balloon: I'll try it out later and let you know the result.

zeldin commented 5 years ago

@zte-majiang Ok, first I tried applying your patch to go-1.12.1. asm_ppc64x.s looked a bit different so I had to apply some of the changes manually. Maybe I messed up somehow, but the result was not great. Instead of a SIGSEGV I now get

runtime: out of memory: cannot allocate 5476375846579085312-byte block (66256896 in use)

I do have a fair bit of RAM, but not that much. :smile:

Next I tried with go-9999 (a.k.a. current git master). Your patch applied cleanly, but now when I try to build docker I get this:

can't load package: package github.com/docker/docker/cmd/dockerd: unknown import path "github.com/docker/docker/cmd/dockerd": cannot find module providing package github.com/docker/docker/cmd/dockerd

Bizarre. The package can't find itself? Anyway, this happens much earlier than the point where I would get the crash before, so maybe it's not related to cgo at all, but just to some other issue with current git head?

zte-majiang commented 5 years ago

@zeldin Sorry to hear this... I'll try to push the patch into trunk first. Anyway, these problems need to be fixed. As for go-1.12.1, I can make a special patch for it later. It should be relatively easy, I suppose...

zeldin commented 5 years ago

@zte-majiang Ok, sounds like a plan. Thanks!

zte-majiang commented 5 years ago

@zeldin I have rebuilt the fix for go1.12.1; see the attached file. Please try it and let me know whether it works for you, thanks. goppc64-112.zip

zeldin commented 5 years ago

@zte-majiang Thanks! With this patch applied to 1.12.1, docker builds fine without any interventions necessary. :100:

ianlancetaylor commented 5 years ago

@zte-majiang Do you want to try to update your changes for master and send a pull request? Thanks.

zte-majiang commented 5 years ago

@zeldin Glad to hear the good news! Thanks. @ianlancetaylor Yes, I will try to push the fix to trunk. Thanks for inviting.

gopherbot commented 5 years ago

Change https://golang.org/cl/174317 mentions this issue: cmd/link, runtime: reserve 48 bytes in stack frame for ppc64

mirzak commented 4 years ago

I have some interest in using cgo on ppc64 and I am curious about the status of this and any further plans. @zte-majiang

@zeldin , do you have something that works on more recent releases?

I tried forward-porting #31738 to go1.15.3, but I get illegal instruction errors at runtime, e.g. with docker:

dockerd[140]: illegal instruction (4) at 10049504 nip 10049504 lr 1006c7bc code 1 in dockerd[10000000+29ce000]
dockerd[140]: code: 7c230840 41800010 7ca802a6 480235a1 4bffffec 7fe802a6 fbe1ff91 90010060
dockerd[140]: code: 3fe011d8 c03f0708 d0210058 d0210050 <f0210cd0> d8210068 90010048 3fe011d8

So either new things have been introduced in go that need changes or I am doing something wrong.

laboger commented 4 years ago

Usually an illegal instruction means your hardware does not support the minimum instruction set used by Golang. If you look in /proc/cpuinfo, what is the cpu?