Open davide-q opened 4 months ago
@llvm/issue-subscribers-lldb
Author: None (davide-q)
Just FYI, in my environment the problem has not changed with rc2
I wonder if https://github.com/llvm/llvm-project/issues/25067 has been inadvertently re-introduced?
I think so.
The original terminfo fix from https://github.com/llvm/llvm-project/pull/69458 was removed by https://github.com/llvm/llvm-project/pull/92865 which removed terminfo as a dependency.
The original fix is mentioned in https://github.com/spack/spack/issues/36954 where they are missing the same symbol you are.
And someone from there turned up on the removal PR: https://github.com/llvm/llvm-project/pull/92865#issuecomment-2261438018
Perhaps this is because on a system where ncurses and terminfo are combined, removing terminfo is ok because the symbols come from the combined ncurses anyway. When they are split into two, we still have to link terminfo as well to be able to use ncurses.
What is your environment and does it have this split? Some of the commands shown in https://github.com/llvm/llvm-project/issues/25067#issue-1076221744 can confirm that, I presume.
@JDevlieghere also commented there but didn't want to commit something without trying it, so if we can run your environment in a container we can figure it out.
Thanks @DavidSpickett
My environment is a bare-metal RHEL 8.8 installation in which I am attempting (as I always have) a spack-based install of LLVM, so all of your sleuthing is spot-on! I don't use containers, so I can't help with that (unless I make a massive effort), but perhaps you would be able to reproduce the issue with spack which is a git-clone and a couple of spack bootstrap commands. Let me know if you would like me to list a step-by-step on how to reproduce it in that context.
The llvm-19.1.0-rc3 version has still this issue (and a couple more, one of which is #106549 )
Nothing has changed with RC4, I updated the name to reflect that.
I hope LLVM is not going to release 19.1 with this building bug.
So disappointing to see the release happening despite this serious problem
LLVM releases follow a set schedule and do not generally wait for specific fixes.
I would think the quickest path here would be for someone with access to the environment in question to prepare a fix.
Let me know if you would like me to list a step-by-step on how to reproduce it in that context.
Yes please, I have zero experience with spack. I think I can use one of the images described in https://developers.redhat.com/blog/2020/03/24/red-hat-universal-base-images-for-docker-users# to reproduce it.
Surprised that this isn't an issue for rhel without spack, but if spack is a full stack rebuild maybe it configures the libraries differently to the base system.
There were various symbol refactors between curses and terminfo in the past where it turned out that many distros distributed essentially broken variants of these libraries for quite some time which IIRC started getting exposed when lld started erroring on misconfigured linker scripts by default. I suspect that you're hitting one of these cases where the rhel libs are broken and now that the llvm link lines are no longer overspecified (by the implicit dep from llvm that was removed in 19.1) it exposes this issue.
So this is not an LLVM building bug, but a bug in rhel's distributed variants of curses and terminfo. You'll most likely be able to fix the issue by using a more recent version of curses and/or terminfo.
Thanks a lot for picking this up @DavidSpickett
Surprised that this isn't an issue for rhel without spack, but if spack is a full stack rebuild maybe it configures the libraries differently to the base system.
Yes, spack tries hard to abstract everything out from the underlying OS and builds an environment which is (as far as possible) independent from where it is running on (and in a way that is modular and allows cherrypicking of dependencies, libraries etc).
Because of that, I suspect the same identical problem would happen in a different OS, therefore if you have easier access to Ubuntu or whatever, you might try that first.
Yes please, I have zero experience with spack.
git clone https://github.com/spack/spack.git xxx
cd xxx
# optionally git checkout releases/v0.22.1-a # or whatever
source share/spack/setup-env.sh
spack compiler list # should return a compiler otherwise you need to add one to your base OS
Then edit var/spack/repos/builtin/packages/llvm/package.py
to add the latest version of LLVM, as follows
--- a/var/spack/repos/builtin/packages/llvm/package.py
+++ b/var/spack/repos/builtin/packages/llvm/package.py
@@ -36,7 +36,12 @@ class Llvm(CMakePackage, CudaPackage, CompilerPackage):
license("Apache-2.0")
version("main", branch="main")
+ version("19.1.0", sha256="0a08341036ca99a106786f50f9c5cb3fbe458b3b74cab6089fd368d0edb2edfe")
version("18.1.3", sha256="fc5a2fd176d73ceb17f4e522f8fe96d8dde23300b8c233476d3609f55d995a7a")
version("18.1.2", sha256="8d686d5ece6f12b09985cb382a3a530dc06bb6e7eb907f57c7f8bf2d868ebb0b")
version("18.1.1", sha256="62439f733311869dbbaf704ce2e02141d2a07092d952fc87ef52d1d636a9b1e4")
version("18.1.0", sha256="eb18f65a68981e94ea1a5aae4f02321b17da9e99f76bfdb983b953f4ba2d3550")
Then run spack spec llvm@19.1.0
and see the list of dependencies that spack will compile and link against LLVM. If everything looks good, go ahed with compilation (see below) otherwise change what dependencies you want to add (+
) or remove (~
) for example spack spec llvm@19.1.0+cuda~clang
. In my case I am using the default with no modifiers which yields
llvm@19.1.0%gcc@13.2.0+clang~cuda~flang+gold~ipo+libomptarget~libomptarget_debug~link_llvm_dylib+lld+lldb+llvm_dylib+lua~mlir+polly~python~split_dwarf~z3~zstd build_system=cmake build_type=Release compiler-rt=runtime generator=ninja libcxx=runtime libunwind=runtime openmp=runtime shlib_symbol_version=none targets=all version_suffix=none
Then compile simply with spack compile XXX
where XXX is same as in the spec, most likely simply llvm@19.1.0
, or possibly with the modifiers of what you decided to add/remove, in my example llvm@19.1.0+cuda~clang
@aaronmondal thanks for your comment, this explains better why it happens (still not sure why "overspecifying the link line" fixes it and what's technically wrong with that overspecification, but I digress...)
It's definitely possible that the underlying libraries I am linking against are broken, and in such a case I can solve my problem by editing the spack package.py
file to include the new dependencies and compile them from source. For me to do so, LLVM should have told me so, right? In the old autoconf days this would/should have been caught when running configure which would/should try to compile and link a dummy test program against all the needed dependency. Seeing a failure at that configure time I would have understood there is a dependency which is not (correctly) satisfied -- or even better it could have been configured to compile without the feature
I am not familiar enough with cmake and the LLVM build system to say how exactly this should happen, but I still consider it a bug in the LLVM build system. The problem should be identified upfront during the equivalent of the configure step and not after an hour of compile time at the final link step. Don't you agree?
The patch in this comment avoids this issue https://github.com/llvm/llvm-project/issues/101368: https://github.com/spack/spack/pull/46504#issuecomment-2372520318
To also bring this bug up again, ChromeOS also was unable to build LLDB because of this breakage. We had to hack in the Spack patch ourselves: https://chromium-review.googlesource.com/c/external/github.com/llvm/llvm-project/+/5902466
This will likely be an issue for other distributions which don't keep the missing tinfo symbols inside the curses shared object.
I've stumbled upon this issue as well for LLVM 19.1.0+19.1.1 which can easily be reproduced with Ncurses 6.5 which got terminfo functions separated from ncurses by default. I've created this simple patch to resolve the issue for me without adding the removed LLVM_ENABLE_TERMINFO back. The key is that split Ncurses will have -ltinfo in the NCURSES_LIBRARIES. It will also not harm alternate curses implementations as it will be empty for those and thus will not affect the resulting LLDB_CURSES_LIBS array. For monolith Ncurses it will just result in double -lncurses, so no problem here either.
--- ./lldb/source/Core/CMakeLists.txt 2024-10-01 15:08:12.000000000 +0300
+++ ./lldb/source/Core/CMakeLists.txt 2024-10-04 01:13:09.558078374 +0300
@@ -10,7 +10,7 @@
set(LLDB_LIBEDIT_LIBS)
if (LLDB_ENABLE_CURSES)
- list(APPEND LLDB_CURSES_LIBS ${PANEL_LIBRARIES} ${CURSES_LIBRARIES})
+ list(APPEND LLDB_CURSES_LIBS ${PANEL_LIBRARIES} ${CURSES_LIBRARIES} ${NCURSES_LIBRARIES})
if (LLVM_BUILD_STATIC)
list(APPEND LLDB_CURSES_LIBS gpm)
endif()
Because of that, I suspect the same identical problem would happen in a different OS, therefore if you have easier access to Ubuntu or whatever, you might try that first.
I have been super busy but do intend to try this at some point, I realise the delay is annoying.
In the meantime if someone wants to propose the above change as a PR, that would also be welcome. In fact if it comes from someone who is directly affected by the problem, it will have more weight behind it.
I was able to reproduce this kind of, but I had to hack my CMake config manually and even then ended up with system ncurses also being referenced. This is no doubt a skill issue on my part but I'm not going to propose a change I can't verify either, especially when it is to a build system.
I've stumbled upon this issue as well for LLVM 19.1.0+19.1.1 which can easily be reproduced with Ncurses 6.5 which got terminfo functions separated from ncurses by default.
@vityank do you know how to properly configure CMake to use an ncurses that is not the system one?
Also please provide the configure command for ncurses. In my attempts I used:
$ ./configure --with-termlib --without-ada --with-shared --with-cxx-shared
Does that look correct?
I did setup Spack and built main
without any issue which either means I did something wrong or the included patch for 19 also applies to 20, but that's not what I understood from the code.
If you don't want to wait for me to figure out the non-spack build, confirm that the above fix (or the one already in spack for 19.x) works with main
and send a PR for it (https://llvm.org/docs/Contributing.html). I am much more able to check that the existing builds work, rather than reproducing this new (to us at least) build configuration.
This started in RC1 and I was hoping to get it fixed before it was released.
The failure is about linking Curses-related objects, spitting out many errors like
and many, many more
I wonder if #25067 has been inadvertently re-introduced?