Open StefanSalewski opened 4 years ago
I do indeed have the feeling that my ar is broken.
stefan@nuc /tmp/www $ ls -lt
total 0
stefan@nuc /tmp/www $ echo "xxx 123" > xxx.o
stefan@nuc /tmp/www $ cat xxx.o
xxx 123
stefan@nuc /tmp/www $ ar -q yyy.a xxx.o
ar: creating yyy.a
Two passes with the same argument (-amdgpu-argument-reg-usage-info) attempted to be registered!
Segmentation fault (core dumped)
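The check above can be wrapped in a small function so that several ar variants can be compared quickly. This is only a sketch: `try_ar` is a name made up for this example, and it only reports the exit status of whatever command it is given. A segmentation fault usually shows up in the shell as exit status 139.

```shell
#!/bin/sh
# try_ar AR_COMMAND: archive a dummy file in a scratch directory and report
# whether the given ar command survives. (Hypothetical helper for illustration.)
try_ar() {
    scratch=$(mktemp -d) || return 1
    echo "xxx 123" > "$scratch/xxx.o"
    if "$1" -q "$scratch/yyy.a" "$scratch/xxx.o" >/dev/null 2>&1; then
        status=0
        echo "$1: ok"
    else
        status=$?   # exit status of the failed ar call
        echo "$1: FAILED (exit status $status)"
    fi
    rm -rf "$scratch"
    return "$status"
}
```

Running `try_ar ar` and then `try_ar gcc-ar` should show whether only the plain ar is affected.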
I am not really sure, as I have never used ar myself before. If it is broken, then the first question is why, and the next question is how I can fix it.
Well I just discovered that a gcc-ar exists, and that seems to work. So I hope I can link ar to gcc-ar to fix it.
stefan@nuc /tmp/www $ lt
total 0
stefan@nuc /tmp/www $ echo "xxx 123" > xxx.o
stefan@nuc /tmp/www $ cat xxx.o
xxx 123
stefan@nuc /tmp/www $ gcc-ar -q yyy.a xxx.o
/usr/lib/gcc/x86_64-pc-linux-gnu/10.0.1/../../../../x86_64-pc-linux-gnu/bin/ar: creating yyy.a
stefan@nuc /tmp/www $ ls -lt
total 8
-rw-r--r-- 1 stefan stefan 76 Mar 10 12:50 yyy.a
-rw-r--r-- 1 stefan stefan 8 Mar 10 12:49 xxx.o
stefan@nuc /tmp/www $ which gcc-ar
/usr/bin/gcc-ar
stefan@nuc /tmp/www $ ls -lt /usr/bin/gcc-ar
lrwxrwxrwx 1 root root 46 Mar 9 19:34 /usr/bin/gcc-ar -> /usr/x86_64-pc-linux-gnu/gcc-bin/10.0.1/gcc-ar
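To see where each tool name really ends up after following every symlink, something like the following may help. It is a sketch only: `resolve_tool` is a made-up helper name, and it assumes `readlink -f` as provided by GNU coreutils.

```shell
#!/bin/sh
# resolve_tool NAME: print the path found on PATH and the final target
# after following all symlinks. (Hypothetical helper, not part of any package.)
resolve_tool() {
    tool_path=$(command -v "$1") || { echo "$1: not found" >&2; return 1; }
    printf '%s -> %s\n' "$tool_path" "$(readlink -f "$tool_path")"
}

# Compare the plain binutils tools with their gcc wrappers.
for t in ar gcc-ar ranlib gcc-ranlib nm gcc-nm; do
    resolve_tool "$t" || true
done
```

If the plain name and the gcc wrapper resolve to different binaries, that is a hint about which link has been changed.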
I had to do the same for ranlib manually (maybe for other binutils tools too?)
/usr/bin # ls -lt
x86_64-pc-linux-gnu-ranlib -> x86_64-pc-linux-gnu-gcc-ranlib
x86_64-pc-linux-gnu-ar -> /usr/bin/gcc-ar
I guess that these links got broken somehow (not pointing to the gcc version) and it seems that gcc-config or eselect gcc do not fix the links when broken.
At least now it seems to work again, I was able to emerge net-misc/openssh again!
Well, the nm link is still wrong:
31 Feb 15 11:39 /usr/bin/x86_64-pc-linux-gnu-nm -> /usr/x86_64-pc-linux-gnu/bin/nm
So the accident happened on Feb 15 -- but I still wonder why.
Well, seems that the problem was and still is
# emerge -av binutils
[ebuild R ] sys-devel/binutils-2.33.1-r1:2.33::gentoo USE="gold nls plugins -default-gold -doc -multitarget -static-libs -test" 0 KiB
which results again in
Mar 11 06:45 x86_64-pc-linux-gnu-ar -> /usr/x86_64-pc-linux-gnu/bin/ar
and ar stops working.
"eselect binutils set" does not fix the issue, it creates the links to the non gcc versions too.
I ran into this while building both clang and binutils. -amdgpu-argument-reg-usage-info appears to be an LLVM flag, presumably present when LLVM is built with LLVM_TARGETS="AMDGPU". But the active toolchain was built entirely with GNU tools. When I unmerged llvm, all build problems disappeared. Maybe binutils or gcc components are somehow dynamically linking against llvm libraries?
Running the offending ar command in gdb shows that a function in /usr/lib64/binutils/x86_64-pc-linux-gnu/2.33.1/libbfd-2.33.1.gentoo-sys-devel-binutils-st.so is calling a function in /usr/x86_64-pc-linux-gnu/binutils-bin/2.33.1/../2.33.1/../lib/bfd-plugins/LLVMgold.so. I don't have gold set as my linker and nothing was built with llvm. Building with -fuse-ld=bfd has no effect.
Thank you very much for your investigations. I cannot comment on the core of this issue, as I do not know much about binutils and ar internals. But I am happy that my box is running well again after manually setting the ar link to gcc-ar.
The problem appears fixed in git HEAD with binutils-9999.
Great. Then I will close this issue in the next week.
Never mind. I spoke too soon. It popped up again. I unmerged llvm-10/clang-10/llvmgold-10 and have no problems with llvm-9/clang-9/llvmgold-9.
I have a new problem now: emerging sys-libs/glibc-2.30-r6 fails. The reason is that ar is called through a different path, namely
/usr/lib/gcc/x86_64-pc-linux-gnu/10.0.1/../../../../x86_64-pc-linux-gnu/bin/ar
which is
cd /usr/lib/gcc/x86_64-pc-linux-gnu/10.0.1/../../../../x86_64-pc-linux-gnu/bin/
nuc /usr/x86_64-pc-linux-gnu/bin # pwd
/usr/x86_64-pc-linux-gnu/bin
ar -> /usr/x86_64-pc-linux-gnu/binutils-bin/2.33.1/ar
Fixing this link manually does not work; I think I get a loop of symlinks when I try to fix it.
I assume that gcc-ar is not an executable on its own, but calls this link too.
I think I have completely removed clang10, but that is not enough. I guess I have to re-emerge some tools, maybe emerge binutils-9999? Might that help? Or would it be better to re-emerge binutils without LTO? It is a bit dangerous of course: I may end up in a situation where everything is completely broken, and I would have to switch back to my backup partition without LTO.
Running emerge -av binutils-libs binutils for version 2.34 does not fix the problem. But what is interesting is that ar works fine when we give it the --plugin argument:
stefan@nuc /tmp $ /usr/x86_64-pc-linux-gnu/binutils-bin/2.34/ar -q yyy.a xxx.o
Two passes with the same argument (-amdgpu-argument-reg-usage-info) attempted to be registered!
Segmentation fault (core dumped)
stefan@nuc /tmp $ /usr/x86_64-pc-linux-gnu/binutils-bin/2.34/ar --plugin=/usr/libexec/gcc/x86_64-pc-linux-gnu/10.0.1/liblto_plugin.so -q yyy.a xxx.o
stefan@nuc /tmp $
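The session above suggests that passing GCC's LTO plugin explicitly avoids loading whatever else sits in the bfd-plugins directory. A minimal sketch of a wrapper follows. The function name and the `AR_BIN` and `LTO_PLUGIN` variables are assumptions made up for this example; `gcc -print-file-name=liblto_plugin.so` is used to locate the plugin when no path is given.

```shell
#!/bin/sh
# ar_with_gcc_plugin ARGS...: call ar with GCC's LTO plugin passed explicitly.
# AR_BIN and LTO_PLUGIN are hypothetical overrides used only in this sketch.
ar_with_gcc_plugin() {
    # Ask gcc for the plugin path unless one was supplied by the caller.
    plugin=${LTO_PLUGIN:-$(gcc -print-file-name=liblto_plugin.so)}
    "${AR_BIN:-ar}" --plugin="$plugin" "$@"
}

# Example (not executed here):
#   ar_with_gcc_plugin -q yyy.a xxx.o
```

This only works around the crash for direct calls; build systems that invoke ar themselves would still need AR pointed at a wrapper such as gcc-ar.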
Maybe related, and there is a suggested fix:
I think finally I found the cause for the real problem:
$ ls -lt /usr/x86_64-pc-linux-gnu/binutils-bin/lib/bfd-plugins
total 8
lrwxrwxrwx 1 root root 60 Mar 10 16:29 liblto_plugin.so -> /usr/libexec/gcc/x86_64-pc-linux-gnu/10.0.1/liblto_plugin.so
lrwxrwxrwx 1 root root 41 Mar 9 13:49 LLVMgold.so -> ../../../../lib/llvm/10/lib64/LLVMgold.so
So on March 9 a failed attempt to install clang10 created a link in /usr/x86_64-pc-linux-gnu/binutils-bin/lib/bfd-plugins to LLVMgold.so of version 10, which was not working. And none of my attempts to uninstall clang10 reset that link. I have now manually reset it to clang9, and now I was able to install glibc again, and I hope my whole box works again.
Maybe a reinstall of clang9 and llvm9 would have fixed that automatically?
I was not aware that clang can break gcc; I had considered the two independent in the past.
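For anyone who wants to check their own plugin directory, here is a rough sketch that lists every plugin link, shows where it points, and flags links whose target no longer exists. `list_bfd_plugins` is a made-up name, and `readlink -f` from GNU coreutils is assumed. A link can also point at an existing but broken plugin, as happened here, so the printed targets still need a look by eye.

```shell
#!/bin/sh
# list_bfd_plugins DIR: show each *.so entry in DIR with its link target and
# mark dangling links. (Hypothetical helper for illustration only.)
list_bfd_plugins() {
    dir=$1
    for entry in "$dir"/*.so; do
        # Skip the unexpanded glob pattern when nothing matches.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        if [ -e "$entry" ]; then
            printf 'ok       %s -> %s\n' "${entry##*/}" "$(readlink -f "$entry")"
        else
            printf 'dangling %s -> %s\n' "${entry##*/}" "$(readlink "$entry")"
        fi
    done
}

# Example (not executed here):
#   list_bfd_plugins /usr/x86_64-pc-linux-gnu/binutils-bin/lib/bfd-plugins
```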
wow, nice find. definitely frustrating to have a cross package failure like that.
Rebuilding llvm:10 without lto seems to have fixed the issue for me. Also, compiler-rt-sanitizers:10 won't build correctly with lto.
I'm also having the compiler-rt-sanitizers issue
FYI: sys-devel/llvmgold is what installs that symlink. I remember having a concern about that in the past, too, because you can have multiple clang slots on your system but the LLVMgold.so plugin will just use the highest available slot, no functionality to switch it around unlike eselect gcc. And LLVM/Clang don't bother to guarantee ABI compatibility across different versions. Even LLVM IR itself is unstable between different LLVM versions -- learned that one the hard way.
Now, this issue seems to pertain to GCC 10, which I haven't tested outside of some sandboxes yet. GCC shouldn't be touching LLVMgold.so regardless, as that was only really used for LTO with Clang before they switched to lld. So, it sounds like before we migrate to GCC 10 we'll need to do some more extensive testing. I think GCC 10 is -fno-common by default, for example, and that could induce a lot of breakage. Let's leave this issue open so we can refer back to it when GCC 10 reaches a stable release.
Just a note: I ran into this with GCC 9.3.0 in my recent attempt to go back to LTO. I fixed it by removing sys-devel/llvmgold. I think firefox pulled it in at some point, but maybe it no longer requires it?
I'm running into this issue as well, but I was able to bypass it by removing AMDGPU from LLVM_TARGETS and re-building llvm/clang-10 using gcc.
I found that building llvm/clang-10 with gcc, and with LLVM_TARGETS=AMDGPU, causes
Two passes with the same argument (-amdgpu-argument-reg-usage-info) attempted to be registered!
If I compile llvm/clang-10 with clang, LLVM_TARGETS=AMDGPU works, but it causes other issues, such as www-client/firefox[pgo,lto,clang] having pgo profile merging failures.
Also, www-client/firefox[clang,lto] depends on llvmgold for some reason even though it uses lld for linking.
For me all issues were fixed by installing =sys-devel/binutils-2.34-r1 (currently not keyworded)
That said, I still cannot properly compile Firefox, but I don't think that is an LTO issue. Firefox compiles without pgo/clang but segfaults immediately at runtime. With clang it does not even build. For now I am on firefox-bin and I'll try again after a while. At least the rest of the system builds correctly with the latest binutils.
I fixed this problem by unmerging llvm and then rebuilding it. It didn't fix the segfault compiling compiler-rt-sanitizers though.
@Hello71 I think disabling LTO for llvm-10 fixes it.
Although, you can also use clang to compile llvm-10 which won't segfault with LTO, but will cause issues with pgo when compiling firefox. (At least this is what's happening on my system.)
sure, but this way you can keep lto.
If I am not mistaken, "[…] you can also use clang to compile llvm-10 which won't segfault with LTO, but will cause issues with pgo when compiling firefox. […]" also means that you keep link-time optimization, but through compiling this package with Clang instead of GCC.
From what I see, it also trades the segmentation fault when compiling compiler-rt-sanitizers in for being unable to compile Firefox with profile-guided optimizations.
@elsandosgrande I should have mentioned that I am using clang to compile firefox, since I think the pgo and lto useflags on firefox are not compatible without the clang useflag.
I'll see if the clang pgo problem also affects other packages, like python.
So I re-emerged llvm-10 and clang-10 with clang as the compiler, using this inside of my `/etc/portage/env`. (I also disable ccache on all packages that use clang due to Gentoo bug 709454.)
USE="clang"
CC="clang"
CXX="clang++"
CFLAGS="${CFLAGS} -fno-math-errno -fno-trapping-math -flto=thin"
CXXFLAGS="${CXXFLAGS} -fno-math-errno -fno-trapping-math -flto=thin"
LDFLAGS="-Wl,--lto-O2 -Wl,-O2 -Wl,--as-needed -fuse-ld=lld"
AR="llvm-ar"
NM="llvm-nm"
RANLIB="llvm-ranlib"
NOLDADD=1
USE_NONGNU=1
I was able to successfully emerge dev-lang/python:3.7 with pgo and clang, using those same environment variables. I'm in the process of re-emerging firefox to see if my issue got cleared up.
UPDATE: Firefox failed to build. Here's the build.log: build.log.tar.gz
I have re-emerged llvm-10 using clang as compiler and without LTO using these environment variables.
USE="clang"
CC="clang"
CXX="clang++"
CFLAGS="${CFLAGS} -fno-math-errno -fno-trapping-math"
CXXFLAGS="${CXXFLAGS} -fno-math-errno -fno-trapping-math"
LDFLAGS="-Wl,-O2 -Wl,--as-needed -fuse-ld=lld"
AR="llvm-ar"
NM="llvm-nm"
RANLIB="llvm-ranlib"
NOLDADD=1
USE_NONGNU=1
I'm currently waiting for Firefox to finish re-emerging which has been going for about 2 and a half hours. Normally it will fail before the first hour, so this is a good sign.
UPDATE: Yup, Firefox successfully emerged with the useflags lto,pgo,clang. At this point I feel like I should make a new issue for this.
I've disabled -fipa-pta for llvm-10 and ar is no longer crashing when loading the LLVMgold.so plugin.
I stepped right into this. And I can’t rebuild LLVM or Clang with Clang using said env variables, because even Clang is broken.
Is there any hope for my system yet, or is it time to format?
BorisCarvajal, your hint works great for me.
My box recently tried to update to clang 10, and I got the error "Two passes with the same argument (-amdgpu-argument-reg-usage-info)" when building some tools like compiler-rt. Your tip fixed it:
$ grep llvm /etc/portage/package.cflags/ltoworkarounds.conf
sys-devel/llvm *FLAGS-="-fipa-pta"
Then rebuild llvm and after that the other packages like compiler-rt.
Althorion, in March my box was also totally broken; gcc and clang refused to work. But I got gcc to work again by manually fixing some links as described at the top of this thread.
@Althorion same as you. Can't rebuild clang or llvm at the moment. Did you end up getting it sorted? I'll give moving links around a go.
@telans unfortunately no. I’ve been trying quite a lot of things and ended up with a system so broken, it couldn’t even shut down, so I saved my @world set, blasted the whole thing and built it anew.
For me at least, all that was needed was emerge -C llvmgold & rebuilding llvm without -fipa-pta. Llvm pulls llvmgold back in after merging.
Six weeks ago I cloned my harddisk partition and started using GentooLTO and gcc10, gcc compiled with lto and pgo.
Until yesterday it was working reasonably well, but now emerging basic packages like libinput or dev-libs/ico fails with messages like
The core message is "Two passes with the same argument (-amdgpu-argument-reg-usage-info) attempted to be registered!" and is generated by the ar tool.
Ar is from binutils, and installing a different binutils version fails with the same message.
I tried switching back to gcc 9.2, but I got the same issue.
Currently I have no idea about the cause of the problem. May it be the ar program itself? I did
Current binutils is
I cannot use a copy of ar from my original partition, as that is version 2.32 and it wants to load a matching lib.
But maybe the cause of the problem is not ar at all. I may switch back to gcc 9.2 and try to re-emerge, without LTO, all the packages which I emerged in the last weeks; maybe that will help.