JuliaCI / julia-buildbot

Buildbot configuration for build.julialang.org

Nightlies do not contain LLVM patch #67

Closed maleadt closed 6 years ago

maleadt commented 6 years ago

I've been running into some issues with LLVM.jl, with tests segfaulting on something I've definitely fixed on master. After some debugging, I think the nightlies do not contain that patch.

For example, take my local build of LLVM without JuliaLang/julia#25794, where the assembly of LLVMGetAttributeCountAtIndex (the function where LLVM.jl segfaults) looks like:

Dump of assembler code for function LLVMGetAttributeCountAtIndex:
   0x0000000000503630 <+0>:     sub    $0x18,%rsp
   0x0000000000503634 <+4>:     mov    %fs:0x28,%rax
   0x000000000050363d <+13>:    mov    %rax,0x8(%rsp)
   0x0000000000503642 <+18>:    xor    %eax,%eax
   0x0000000000503644 <+20>:    mov    0x98(%rdi),%rax
   0x000000000050364b <+27>:    mov    %rsp,%rdi
   0x000000000050364e <+30>:    mov    %rax,(%rsp)
   0x0000000000503652 <+34>:    callq  0x3815f0 <_ZNK4llvm12AttributeSet13getAttributesEj@plt>
   0x0000000000503657 <+39>:    mov    0x8(%rsp),%rdx
   0x000000000050365c <+44>:    xor    %fs:0x28,%rdx
   0x0000000000503665 <+53>:    mov    0x8(%rax),%eax
   0x0000000000503668 <+56>:    jne    0x50366f <LLVMGetAttributeCountAtIndex+63>
   0x000000000050366a <+58>:    add    $0x18,%rsp
   0x000000000050366e <+62>:    retq   
   0x000000000050366f <+63>:    callq  0x3876c0 <__stack_chk_fail@plt>

Applying the patch transforms the assembly into:

Dump of assembler code for function LLVMGetAttributeCountAtIndex:
   0x0000000000503630 <+0>:     sub    $0x18,%rsp
   0x0000000000503634 <+4>:     mov    %fs:0x28,%rax
   0x000000000050363d <+13>:    mov    %rax,0x8(%rsp)
   0x0000000000503642 <+18>:    xor    %eax,%eax
   0x0000000000503644 <+20>:    mov    0x98(%rdi),%rax
   0x000000000050364b <+27>:    mov    %rsp,%rdi
   0x000000000050364e <+30>:    mov    %rax,(%rsp)
   0x0000000000503652 <+34>:    callq  0x3815f0 <_ZNK4llvm12AttributeSet13getAttributesEj@plt>
   0x0000000000503657 <+39>:    xor    %edx,%edx
   0x0000000000503659 <+41>:    test   %rax,%rax
   0x000000000050365c <+44>:    je     0x503661 <LLVMGetAttributeCountAtIndex+49>
   0x000000000050365e <+46>:    mov    0x8(%rax),%edx
   0x0000000000503661 <+49>:    mov    0x8(%rsp),%rcx
   0x0000000000503666 <+54>:    xor    %fs:0x28,%rcx
   0x000000000050366f <+63>:    mov    %edx,%eax
   0x0000000000503671 <+65>:    jne    0x503678 <LLVMGetAttributeCountAtIndex+72>
   0x0000000000503673 <+67>:    add    $0x18,%rsp
   0x0000000000503677 <+71>:    retq   
   0x0000000000503678 <+72>:    callq  0x3876c0 <__stack_chk_fail@plt>

Note the test %rax,%rax and the jump over mov 0x8(%rax),%edx, corresponding to the check for a returned null pointer.

Meanwhile, on today's nightly (0.7.0-DEV.3998, 4371808c4e) we see the following assembly:

Dump of assembler code for function LLVMGetAttributeCountAtIndex:
   0x00000000004effc0 <+0>:     sub    $0x18,%rsp
   0x00000000004effc4 <+4>:     mov    0x98(%rdi),%rax
   0x00000000004effcb <+11>:    lea    0x8(%rsp),%rdi
   0x00000000004effd0 <+16>:    mov    %rax,0x8(%rsp)
   0x00000000004effd5 <+21>:    callq  0x37d390 <_ZNK4llvm12AttributeSet13getAttributesEj@plt>
   0x00000000004effda <+26>:    mov    0x8(%rax),%eax
   0x00000000004effdd <+29>:    add    $0x18,%rsp
   0x00000000004effe1 <+33>:    retq

Quite a bit cleaner (no stack protector?), but more notably no test for a null pointer. So it seems the patch to LLVM has not been applied?
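The comparison between the dumps above could be automated roughly as follows. This is only a sketch: `has_null_check` is a hypothetical helper name, and it assumes the disassembly has the same shape as the dumps in this issue (a `callq` to `AttributeSet::getAttributes` followed, in the patched build, by `test %rax,%rax`).

```shell
# Hypothetical helper: reads a gdb disassembly dump on stdin and succeeds
# (exit status 0) only if a `test %rax,%rax` appears somewhere after the
# call to llvm::AttributeSet::getAttributes.
has_null_check() {
    awk '
        # Remember that we have passed the call to getAttributes.
        /callq/ && /AttributeSet13getAttributes/ { seen_call = 1; next }
        # The patched build tests the returned pointer before dereferencing it.
        seen_call && /test[ \t]+%rax,%rax/ { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

It could be fed from something like `gdb -batch -ex 'disassemble LLVMGetAttributeCountAtIndex' path/to/libLLVM.so | has_null_check`, where the library path is a placeholder. A different compiler or optimization level may emit the null check differently, so a failing result is only a hint, not proof that the patch is missing.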

cc @staticfloat as per template

staticfloat commented 6 years ago

Is this the linux64 nightly you're looking at?

maleadt commented 6 years ago

Yes, but the Travis failure also happens on osx: https://travis-ci.org/maleadt/LLVM.jl/jobs/342490156#L137

ararslan commented 6 years ago

We could try a complete git clean -fdx and build from scratch.

staticfloat commented 6 years ago

Hmmm, the linux64 binaries are definitely recent enough, and logging in to them, I can verify that the effects of llvm-3.9-c_api_nullptr.patch have been applied to the source code. Perhaps there is some bug in the make system that is causing the source changes to not propagate out into changes in the actual library file. I'll nuke it and see if that changes anything for you.

maleadt commented 6 years ago

perhaps there is some bug in the make system that is causing the source changes to not propagate out into changes in the actual library file

So we are building incrementally?

$ make -C deps compile-llvm 
make: Entering directory '/home/tbesard/Julia/julia-dev/build/dist/deps'
make: Nothing to be done for 'compile-llvm'.
make: Leaving directory '/home/tbesard/Julia/julia-dev/build/dist/deps'
$ touch ../../deps/srccache/llvm-3.9.1/lib/IR/Core.cpp 
$ make -C deps compile-llvm 
make: Entering directory '/home/tbesard/Julia/julia-dev/build/dist/deps'
make: Nothing to be done for 'compile-llvm'.
make: Leaving directory '/home/tbesard/Julia/julia-dev/build/dist/deps'

i.e., no rebuild. Tested with actual changes too.
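Since make reports nothing to do even after the source changes, one way to spot a stale library is to compare timestamps directly. A minimal sketch, assuming modification times are meaningful on the build machine; `stale_sources` is a hypothetical helper name, not part of the Julia build system.

```shell
# Hypothetical helper: list every file under a source directory that was
# modified more recently than a given build product.
# Usage: stale_sources <source-dir> <built-file>
stale_sources() {
    find "$1" -type f -newer "$2"
}
```

For instance, `stale_sources deps/srccache/llvm-3.9.1/lib/IR path/to/libLLVM.so` (the library path is a placeholder) would print `Core.cpp` after the `touch` above if the library had not been relinked. Empty output does not prove the library is current, since a patch applied before the last build but never compiled in would go unnoticed if the library was later touched.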

I thought @vtjnash had mentioned we use ccache with clean builds now, presumably to avoid such issues.

staticfloat commented 6 years ago

We haven't quite switched over to that yet (I did some proofs of concept to show how nice that would be) because our Windows build system can't handle that yet. I would really like to switch over to building on linux with wine, but bootstrap fails during that segment, so we can't use ccache + cross-compilers. Yet.

maleadt commented 6 years ago

OK thanks, LLVM.jl now works again with Linux nightlies. I hope the patch will be part of the other nightlies too, once those get updated?