Open Jimbly opened 7 years ago
I do not get linker errors when I compile your test project. Please attach a working and a broken static archive.
$ cat ./test.sh
#!/bin/bash
vers="32 33 34 35 36 37 38 39 40 50"
for v in $vers; do
export PATH=/opt/compiler/llvm$v/bin:$PATH
echo
x86_64-apple-darwin16-cc --version | grep version
./simple_make_osxcross.sh
done
$ ./test.sh
clang version 3.2 (tags/RELEASE_32/final)
+ mkdir -p build-darwin-64
+ cd build-darwin-64
+ x86_64-apple-darwin16-cc -DGLEW_NO_GLU -DGLEW_STATIC -I.. -o glew.c.o -c ../glew/glew.c
+ '[' -e libglew.a ']'
+ rm libglew.a
+ x86_64-apple-darwin16-ar cq libglew.a glew.c.o
+ x86_64-apple-darwin16-ranlib libglew.a
+ x86_64-apple-darwin16-c++ -stdlib=libc++ -DGLM_COMPILER=0 -std=gnu++11 -o simple_test.cpp.o -c ../simple_test.cpp
+ x86_64-apple-darwin16-c++ -stdlib=libc++ simple_test.cpp.o -o SimpleTest libglew.a
clang version 3.3 (tags/RELEASE_33/final)
+ mkdir -p build-darwin-64
+ cd build-darwin-64
+ x86_64-apple-darwin16-cc -DGLEW_NO_GLU -DGLEW_STATIC -I.. -o glew.c.o -c ../glew/glew.c
+ '[' -e libglew.a ']'
+ rm libglew.a
+ x86_64-apple-darwin16-ar cq libglew.a glew.c.o
+ x86_64-apple-darwin16-ranlib libglew.a
+ x86_64-apple-darwin16-c++ -stdlib=libc++ -DGLM_COMPILER=0 -std=gnu++11 -o simple_test.cpp.o -c ../simple_test.cpp
+ x86_64-apple-darwin16-c++ -stdlib=libc++ simple_test.cpp.o -o SimpleTest libglew.a
clang version 3.4.2 (tags/RELEASE_34/dot2-rc1)
+ mkdir -p build-darwin-64
+ cd build-darwin-64
+ x86_64-apple-darwin16-cc -DGLEW_NO_GLU -DGLEW_STATIC -I.. -o glew.c.o -c ../glew/glew.c
+ '[' -e libglew.a ']'
+ rm libglew.a
+ x86_64-apple-darwin16-ar cq libglew.a glew.c.o
+ x86_64-apple-darwin16-ranlib libglew.a
+ x86_64-apple-darwin16-c++ -stdlib=libc++ -DGLM_COMPILER=0 -std=gnu++11 -o simple_test.cpp.o -c ../simple_test.cpp
+ x86_64-apple-darwin16-c++ -stdlib=libc++ simple_test.cpp.o -o SimpleTest libglew.a
clang version 3.5.2 (tags/RELEASE_352/final)
+ mkdir -p build-darwin-64
+ cd build-darwin-64
+ x86_64-apple-darwin16-cc -DGLEW_NO_GLU -DGLEW_STATIC -I.. -o glew.c.o -c ../glew/glew.c
+ '[' -e libglew.a ']'
+ rm libglew.a
+ x86_64-apple-darwin16-ar cq libglew.a glew.c.o
+ x86_64-apple-darwin16-ranlib libglew.a
+ x86_64-apple-darwin16-c++ -stdlib=libc++ -DGLM_COMPILER=0 -std=gnu++11 -o simple_test.cpp.o -c ../simple_test.cpp
+ x86_64-apple-darwin16-c++ -stdlib=libc++ simple_test.cpp.o -o SimpleTest libglew.a
clang version 3.6.2 (tags/RELEASE_362/final)
+ mkdir -p build-darwin-64
+ cd build-darwin-64
+ x86_64-apple-darwin16-cc -DGLEW_NO_GLU -DGLEW_STATIC -I.. -o glew.c.o -c ../glew/glew.c
+ '[' -e libglew.a ']'
+ rm libglew.a
+ x86_64-apple-darwin16-ar cq libglew.a glew.c.o
+ x86_64-apple-darwin16-ranlib libglew.a
+ x86_64-apple-darwin16-c++ -stdlib=libc++ -DGLM_COMPILER=0 -std=gnu++11 -o simple_test.cpp.o -c ../simple_test.cpp
+ x86_64-apple-darwin16-c++ -stdlib=libc++ simple_test.cpp.o -o SimpleTest libglew.a
clang version 3.7.1 (branches/release_37 256704)
+ mkdir -p build-darwin-64
+ cd build-darwin-64
+ x86_64-apple-darwin16-cc -DGLEW_NO_GLU -DGLEW_STATIC -I.. -o glew.c.o -c ../glew/glew.c
+ '[' -e libglew.a ']'
+ rm libglew.a
+ x86_64-apple-darwin16-ar cq libglew.a glew.c.o
+ x86_64-apple-darwin16-ranlib libglew.a
+ x86_64-apple-darwin16-c++ -stdlib=libc++ -DGLM_COMPILER=0 -std=gnu++11 -o simple_test.cpp.o -c ../simple_test.cpp
+ x86_64-apple-darwin16-c++ -stdlib=libc++ simple_test.cpp.o -o SimpleTest libglew.a
clang version 3.8.0 (https://github.com/llvm-mirror/clang.git 9fd77bd68130d9b2fbc56a3138b6f981d560480a) (https://github.com/llvm-mirror/llvm.git ad5750369cc5b19f36c149f7b13151c99c7be47a)
+ mkdir -p build-darwin-64
+ cd build-darwin-64
+ x86_64-apple-darwin16-cc -DGLEW_NO_GLU -DGLEW_STATIC -I.. -o glew.c.o -c ../glew/glew.c
+ '[' -e libglew.a ']'
+ rm libglew.a
+ x86_64-apple-darwin16-ar cq libglew.a glew.c.o
+ x86_64-apple-darwin16-ranlib libglew.a
+ x86_64-apple-darwin16-c++ -stdlib=libc++ -DGLM_COMPILER=0 -std=gnu++11 -o simple_test.cpp.o -c ../simple_test.cpp
+ x86_64-apple-darwin16-c++ -stdlib=libc++ simple_test.cpp.o -o SimpleTest libglew.a
clang version 3.9.0 (tags/RELEASE_390/final)
+ mkdir -p build-darwin-64
+ cd build-darwin-64
+ x86_64-apple-darwin16-cc -DGLEW_NO_GLU -DGLEW_STATIC -I.. -o glew.c.o -c ../glew/glew.c
+ '[' -e libglew.a ']'
+ rm libglew.a
+ x86_64-apple-darwin16-ar cq libglew.a glew.c.o
+ x86_64-apple-darwin16-ranlib libglew.a
+ x86_64-apple-darwin16-c++ -stdlib=libc++ -DGLM_COMPILER=0 -std=gnu++11 -o simple_test.cpp.o -c ../simple_test.cpp
+ x86_64-apple-darwin16-c++ -stdlib=libc++ simple_test.cpp.o -o SimpleTest libglew.a
clang version 4.0.0 (https://github.com/llvm-mirror/clang.git 559aa046fe3260d8640791f2249d7b0d458b5700) (https://github.com/llvm-mirror/llvm.git 4423e351176a92975739dd4ea43c2ff5877236ae)
+ mkdir -p build-darwin-64
+ cd build-darwin-64
+ x86_64-apple-darwin16-cc -DGLEW_NO_GLU -DGLEW_STATIC -I.. -o glew.c.o -c ../glew/glew.c
+ '[' -e libglew.a ']'
+ rm libglew.a
+ x86_64-apple-darwin16-ar cq libglew.a glew.c.o
+ x86_64-apple-darwin16-ranlib libglew.a
+ x86_64-apple-darwin16-c++ -stdlib=libc++ -DGLM_COMPILER=0 -std=gnu++11 -o simple_test.cpp.o -c ../simple_test.cpp
+ x86_64-apple-darwin16-c++ -stdlib=libc++ simple_test.cpp.o -o SimpleTest libglew.a
clang version 5.0.0 (tags/RELEASE_500/final)
+ mkdir -p build-darwin-64
+ cd build-darwin-64
+ x86_64-apple-darwin16-cc -DGLEW_NO_GLU -DGLEW_STATIC -I.. -o glew.c.o -c ../glew/glew.c
+ '[' -e libglew.a ']'
+ rm libglew.a
+ x86_64-apple-darwin16-ar cq libglew.a glew.c.o
+ x86_64-apple-darwin16-ranlib libglew.a
+ x86_64-apple-darwin16-c++ -stdlib=libc++ -DGLM_COMPILER=0 -std=gnu++11 -o simple_test.cpp.o -c ../simple_test.cpp
+ x86_64-apple-darwin16-c++ -stdlib=libc++ simple_test.cpp.o -o SimpleTest libglew.a
Interesting! Well, I've uploaded binaries here: https://github.com/Jimbly/osxcross-test/tree/master/SuperSimple/binaries
If you want to use Docker, I can upload my whole state for you to poke around with if that would help.
Looks like we are on a slightly different clang version (your build_clang.sh, which I ran when setting up my environment, builds 3.9.1 by default, I think?)
x86_64-apple-darwin14-cc --version | grep version
clang version 3.9.1 (tags/RELEASE_391/final)
Yes, 3.9.1 is the default version for now.
Are you sure you have used the exact same object file for both archives?
Absolutely! I do see that the extracted file differs from the one put in, at least the first time:
# rm libglew.a
# cp glew.c.o glew.c.o.orig
# ar rq libglew.a glew.c.o
# ar -x libglew.a glew.c.o
# diff glew.c.o glew.c.o.orig
Binary files glew.c.o and glew.c.o.orig differ
# cp glew.c.o.orig glew.c.o
# ar rq libglew.a glew.c.o
# ar -x libglew.a glew.c.o
# diff glew.c.o glew.c.o.orig
#
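The round-trip check above can be wrapped in a small helper so that the extraction never overwrites the original object. This is only a sketch: `member_roundtrips` and the `AR` override are names made up here, and it assumes the tool under test accepts the standard `x` (extract) operation.

```shell
# Hypothetical helper (not part of osxcross): succeed only if the member stored
# in the archive is byte-identical to the object file on disk.
# AR is an assumed override; set it to e.g. x86_64-apple-darwin14-ar.
AR="${AR:-ar}"

member_roundtrips() {
    archive="$1"
    original="$2"
    member="$(basename "$original")"
    # Use absolute paths, because extraction happens in a scratch directory.
    archive="$(cd "$(dirname "$archive")" && pwd)/$(basename "$archive")"
    original="$(cd "$(dirname "$original")" && pwd)/$member"
    scratch="$(mktemp -d)" || return 2
    # Extract into the scratch directory so the original is left untouched.
    if (cd "$scratch" && "$AR" x "$archive" "$member"); then
        cmp -s "$scratch/$member" "$original"
        status=$?
    else
        status=2
    fi
    rm -rf "$scratch"
    return "$status"
}
```

For example, `member_roundtrips libglew.a glew.c.o && echo same || echo different` would report the result without touching glew.c.o.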
Okay, sorry I do not use Docker. What distribution (and version) are you on?
Actually, not 100% certain the versions used to make the binaries I uploaded were identical (if CC produced different results, then they'd have been different). I've updated the checked in binaries to have the source object, and the good and bad libraries, simply made with:
ar rq libglew.a glew.c.o
cp libglew.a libglew.a.bad
ar rq libglew.a glew.c.o
mv libglew.a libglew.a.good
Linux version is from the "buildpack-deps:jessie-curl" Docker repo, which is "Debian Jessie" or "Debian 8", I guess (though incredibly trimmed down).
# uname -a
Linux b416e4ea453c 4.4.83-boot2docker #1 SMP Fri Aug 18 17:31:15 UTC 2017 x86_64 GNU/Linux
# cat /etc/issue
Debian GNU/Linux 8 \n \l
I hadn't thought it might be distro related, but I'll try this on ubuntu:xenial and see if it behaves differently (takes quite a while to build in a VM though, so I won't know until tomorrow, probably).
Not the distribution but the compiler might be the problem (or undefined behavior in cctools). Did you compile OSXCross with clang?
Everything I did to set up my instance is here: https://gist.github.com/Jimbly/105538265da42682488ada15ba477a77
Build Clang, install it, then build OSXCross, so I think so!
Edit: Ignore the last part of that Dockerfile where I replace x86_64-apple-darwin14-ar; that's my workaround, a simple script that runs the ar commands twice. I'm disabling it while testing this stuff in this thread.
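For readers who want to picture that workaround, here is a minimal sketch of a "run ar twice" wrapper. It is a guess at the general shape, not the actual script from the Dockerfile; `ar_twice` and `REAL_AR` (the original cross ar, moved aside under an assumed name) are hypothetical.

```shell
# Hypothetical wrapper: run the real cross ar twice with identical arguments.
# REAL_AR is an assumed name for the original binary after renaming it.
REAL_AR="${REAL_AR:-x86_64-apple-darwin14-ar.real}"

ar_twice() {
    # First pass: creates the archive.
    "$REAL_AR" "$@" || return $?
    # Second pass: same arguments again.
    "$REAL_AR" "$@"
}
```

With `r` the second pass replaces the member that was just added. With `q` (as in `ar cq`) it would most likely append a second copy instead, so a real wrapper would probably need to handle that case separately.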
I think I tried without building Clang the first time, and tried using https://github.com/multiarch/crossbuild before that (all with the same results). multiarch/crossbuild also uses buildpack-deps:jessie-curl as its base OS image, so if something is wrong with cctools or something there, I'm using the same one. Maybe the attempt in a recent Ubuntu image will be successful, we'll see =).
Tried this with an Ubuntu-based Docker image: still no luck, exact same behavior. This was without building clang 3.9.1 first, just using the clang from the apt repositories. I can try building clang locally as well (though that takes much longer, I think that's the step that will take until tomorrow ; ).
Does the OSX SDK version have anything to do with x86_64-apple-darwin14-ar, or is this built independently? I'm using the packaged SDK that's used in multiarch/crossbuild, so maybe there's an issue with that?
Does the OSX SDK version have anything to do with x86_64-apple-darwin14-ar, or is this built independently?
No. I will try to investigate.
Fresh build of Clang 3.9.1 and then OSXCross on an Ubuntu Xenial Docker image finished, and I have the same erroneous result.
I installed Ubuntu Xenial in a VM but I still cannot reproduce this.
$ uname -a
Linux thomas-VirtualBox 4.10.0-28-generic #32~16.04.2-Ubuntu SMP Thu Jul 20 10:19:48 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
thomas@thomas-VirtualBox:~/tmp/osxcross-test/SuperSimple$ ./simple_make_osxcross.sh
+ mkdir -p build-darwin-64
+ cd build-darwin-64
+ x86_64-apple-darwin15-cc -DGLEW_NO_GLU -DGLEW_STATIC -I.. -o glew.c.o -c ../glew/glew.c
+ [ -e libglew.a ]
+ rm libglew.a
+ x86_64-apple-darwin15-ar cq libglew.a glew.c.o
+ x86_64-apple-darwin15-ranlib libglew.a
+ x86_64-apple-darwin15-c++ -stdlib=libc++ -DGLM_COMPILER=0 -std=gnu++11 -o simple_test.cpp.o -c ../simple_test.cpp
+ x86_64-apple-darwin15-c++ -stdlib=libc++ simple_test.cpp.o -o SimpleTest libglew.a
thomas@thomas-VirtualBox:~/tmp/osxcross-test/SuperSimple$ ./simple_make_osxcross_works.sh
+ mkdir -p build-darwin-64
+ cd build-darwin-64
+ x86_64-apple-darwin15-cc -DGLEW_NO_GLU -DGLEW_STATIC -I.. -o glew.c.o -c ../glew/glew.c
+ [ -e libglew.a ]
+ rm libglew.a
+ x86_64-apple-darwin15-ar r libglew.a glew.c.o
x86_64-apple-darwin15-ar: creating archive libglew.a
+ x86_64-apple-darwin15-ar r libglew.a glew.c.o
+ x86_64-apple-darwin15-ranlib libglew.a
+ x86_64-apple-darwin15-c++ -stdlib=libc++ -DGLM_COMPILER=0 -std=gnu++11 -o simple_test.cpp.o -c ../simple_test.cpp
+ x86_64-apple-darwin15-c++ -stdlib=libc++ simple_test.cpp.o -o SimpleTest libglew.a
I tried this on a physical Ubuntu server, and it also worked fine (though using the clang 3.8 that apt installed, so not quite the same). That rules out the OSX SDK version, or any similar difference between our setups, as the cause. At least I have one working version now! I'll try to narrow down the differences. Who knows, maybe it's a weird problem interacting with Docker or something, though that would seem utterly bizarre (and frightening, if I'm relying on Docker for other things ; ). Thanks for your help so far.
Might be undefined behavior in ar. Wouldn't be the first time.
https://github.com/tpoechtrager/cctools-port/commit/5e5b511a3eebd3fa9ece49bead85a1be29a25dfb
https://github.com/tpoechtrager/cctools-port/commit/5098fc9f6dfba7eac18f5ae55b7e361314e0fb7b
Try to run ar within valgrind and/or build cctools with -fsanitize=address.
CXX="clang++ -fsanitize=address" CC="clang -fsanitize=address" ./build.sh
Definitely looking like undefined behavior: if I copy the ar and ranlib that I built on the machine where they seemed to work into my Docker image and run them there, they show the bad behavior.
I tried running with valgrind (I'm unfamiliar with it in general). It seems to just display output about possible memory leaks and finds no errors (though it's probably just valgrinding ar and not the ranlib child process...).
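On the suspicion that only ar itself was being checked: valgrind does not follow processes started via exec unless asked to. A minimal sketch, assuming valgrind is installed; `valgrind_ar` and the `VALGRIND` override are hypothetical names, while `--trace-children=yes` and `--error-exitcode` are standard valgrind options.

```shell
# VALGRIND is an assumed override so a different binary can be substituted.
VALGRIND="${VALGRIND:-valgrind}"

valgrind_ar() {
    # --trace-children=yes: also instrument processes started via exec,
    #   which is how ar runs ranlib internally.
    # --error-exitcode=1: make detected memory errors fail the command.
    "$VALGRIND" --trace-children=yes --error-exitcode=1 "$@"
}
```

Usage would look like `valgrind_ar x86_64-apple-darwin14-ar r test.a glew.c.o`.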
It seems the version of clang I built with build_clang.sh does not include everything necessary for -fsanitize=address, so I tried installing things from apt instead. The result when I run ar now is some memory leaks reported when it runs ranlib internally, which I guess returns a failure, so ar gives up:
x86_64-apple-darwin14-ar r test.a glew.c.o
=================================================================
==5582==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 384 byte(s) in 1 object(s) allocated from:
#0 0x4a037b (/C/SRC/osxcross/x86_64-apple-darwin14-ranlib+0x4a037b)
... <snip>
SUMMARY: AddressSanitizer: 400 byte(s) leaked in 3 allocation(s).
/C/SRC/osxcross/x86_64-apple-darwin14-ar: internal ranlib command failed
But, I don't think these "memory leaks" matter, not sure if this is hiding some actual checking that would be useful, or if this is just a wild goose chase =).
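If the leak reports are only noise here, LeakSanitizer can be switched off so that ranlib exits normally while AddressSanitizer still reports real memory errors. A sketch, assuming an ASan-instrumented ar and ranlib; `without_leak_check` is a hypothetical name, and `detect_leaks=0` in `ASAN_OPTIONS` is the standard runtime switch for this.

```shell
without_leak_check() {
    # Append detect_leaks=0 to any ASAN_OPTIONS already set, for this command
    # only. Child processes (such as the internal ranlib) inherit the variable.
    ASAN_OPTIONS="${ASAN_OPTIONS:+$ASAN_OPTIONS:}detect_leaks=0" "$@"
}
```

Usage would look like `without_leak_check x86_64-apple-darwin14-ar r test.a glew.c.o`.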
Digging through the source a bit, I narrowed it down: apparently ar r launches ranlib -q. I can skip that and do the steps manually with ar rS, and then the sequence of events looks like this:
ar rS test.a glew.c.o # extracts equal to original after this
ranlib -q test.a # extracts different from original after this
ar rS test.a glew.c.o # extracts equal to original after this
ranlib -q test.a # extracts equal to original after this
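The four steps above can be scripted so that the stored member's checksum is printed after each one, which makes it easy to compare the Docker and non-Docker runs. A sketch only: `trace_steps`, `member_cksum`, and the `AR`/`RANLIB` overrides are hypothetical names, and it assumes the tools accept `ar p` (print member to stdout) in addition to the `ar rS` and `ranlib -q` invocations used above.

```shell
# AR and RANLIB are assumed overrides that default to the cross tools
# named in this thread.
AR="${AR:-x86_64-apple-darwin14-ar}"
RANLIB="${RANLIB:-x86_64-apple-darwin14-ranlib}"

# Checksum of the member as currently stored in the archive.
member_cksum() {
    "$AR" p "$1" "$2" | cksum
}

trace_steps() {
    archive="$1"
    object="$2"
    member="$(basename "$object")"
    rm -f "$archive"
    echo "original: $(cksum < "$object")"
    # Two passes of (ar rS, ranlib -q), printing the checksum after each step.
    for pass in 1 2; do
        "$AR" rS "$archive" "$object" || return 1
        echo "after ar rS #$pass: $(member_cksum "$archive" "$member")"
        "$RANLIB" -q "$archive" || return 1
        echo "after ranlib -q #$pass: $(member_cksum "$archive" "$member")"
    done
}
```

Calling `trace_steps test.a glew.c.o` prints five lines; any line whose checksum differs from the first marks a step that changed the stored object.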
Unfortunately neither running valgrind nor building with -fsanitize=address seems to find anything wrong with ranlib, although I'm not 100% certain I managed to build ranlib with -fsanitize=address (I modified libstuff/Makefile, ar/Makefile, and misc/Makefile to add that, which seems to work, though I could be missing something, and my Makefile-fu is weak =). It does print out "LeakSanitizer: detected memory leaks" though, so it seems to be working.
However....
Using my new understanding of how these work, I see that, even in my "working" real Ubuntu environment, that ranlib -q step does modify the glew.c.o inside of the archive; however, even after that, everything links fine. I think for some of this I might have been chasing a red herring: perhaps ranlib is supposed to modify the .o file stored in the .a library in this case, and the fact that extracting my .o file gives me something different than I put in is not actually the problem? That being said, of my four library files (built on [Real Ubuntu, Docker], running ar [once, twice]), only docker-once causes this link error, and docker-twice == ubuntu-twice (except for what look like timestamps in the header), so it still seems that the result of ranlib in my Docker instance (even if it's the same ranlib I copied from real Ubuntu) is the point of failure.
Does llvm-ar work for you?
llvm-ar does not help, though if I also switch to llvm-ranlib, the problem seems to go away (I have to do both; either one by itself doesn't help, which makes sense since ar calls ranlib internally). I guess I could just replace x86_64-apple-darwinX-ar/ranlib with the llvm- versions instead of my current hacky workaround of calling ar twice. Still not sure what the root of the problem is, though.
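One low-effort way to try that swap without touching the installed cross tools is to make the archive step take its tools from variables. A sketch, mirroring the `ar cq` plus `ranlib` steps from the build log earlier in the thread; `build_archive` is a hypothetical name, and the defaults assume llvm-ar and llvm-ranlib are on PATH.

```shell
# AR and RANLIB are assumed overrides; the defaults select the LLVM tools.
AR="${AR:-llvm-ar}"
RANLIB="${RANLIB:-llvm-ranlib}"

build_archive() {
    archive="$1"
    shift
    # Start from a clean archive, as the build scripts in this thread do.
    rm -f "$archive"
    "$AR" cq "$archive" "$@" && "$RANLIB" "$archive"
}
```

Usage would look like `build_archive libglew.a glew.c.o`; setting `AR` and `RANLIB` back to the x86_64-apple-darwin14-* tools gives the original behavior for comparison.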
Getting this link error:
Doesn't happen for all static libraries, but does for "glew", which is a pretty straightforward (if large) single-file library. I've created a super simple reproduction case here: https://github.com/Jimbly/osxcross-test/tree/master/SuperSimple
I've got 3 make scripts there (pruned down from what CMake generated): one for running natively on macOS (works fine), the same but modified to use x86_64-apple-darwin14-* versions of the commands (fails), and a third which weirdly works around it.
As far as I can tell, the problem is with ar. If I run ar r twice to create the library and then replace the internal file with an identical file, the problem magically goes away. If I build just the library on native macOS, everything links fine in osxcross.
A sibling to that is a more complicated actual project: https://github.com/Jimbly/osxcross-test/tree/master/CMakeRunnable. It is set up with CMake and runs fine when built natively on both Linux and OSX, but fails to build using osxcross (it requires a bit of poking to get CMake to work with OSXCross, though). I have no idea how to trick CMake into generating two ar r statements instead of one ar qc, so I'm not sure how to apply my workaround, but it seems like this is a bug that should be fixed (whether here or in clang or ar, I have no idea). I've tried building Clang 3.9.1 following the instructions here and still had no luck.