easybuilders / easybuild

EasyBuild - building software with ease
http://easybuild.io
GNU General Public License v2.0

RFC: Implementing a proper Clang toolchain #640

Open geimer opened 4 years ago

geimer commented 4 years ago

Disclaimer: I'm by no means an LLVM or Clang expert. The information below is just a collection of bits and pieces found in various places as well as my personal thoughts on how EasyBuild support could be improved.


Target

A working LLVM-based toolchain -- at least for C/C++ -- with minimal redundancy. Here, "toolchain" is not meant in the EasyBuild sense (i.e., including an MPI, math libs, etc.), but merely refers to a compiler environment that end users can use to build their codes. (This doesn't rule out providing an MPI without Fortran support built with Clang, though.)

With proper Fortran support being on the horizon, however, it might become a full toolchain in the EasyBuild sense in the future. This should be taken into account in the design.

Status quo

LLVM / Clang / flang

LLVM provides a framework for code optimization and generation for many different target CPUs. The most prominent language frontend is Clang, which focuses on C-like languages (C, C++, Objective-C, OpenCL). Basically all commercial compiler vendors (Intel, PGI, Cray, IBM, Fujitsu, ARM) have since switched to Clang as the basis for their C/C++ compilers.

Fortran support was started based on the PGI Fortran compiler frontend, see the flang project on GitHub, now called "old/legacy/classic flang". It requires patched versions of LLVM and Clang, though, and seems stuck at LLVM 9. However, this mailing list post suggests that there might be an update for LLVM 11 ("LLVM11 with classic flang is on various vendor's roadmap for this autumn, so one of us will do it I'm sure.").

Besides, there is a "new flang" frontend (formerly called f18) written from scratch, now developed as an official LLVM project. However, it isn't fully functional yet and still depends on another compiler to do the actual work, see this mailing list post.

EasyBuild

EasyBuild currently includes various LLVM packages which are used as dependencies by, for example, Mesa, numba, and Rust. Recent versions are built on top of GCCcore, and only include the core LLVM libraries and tools.

In addition, there are various Clang easyconfigs. Again, recent versions are (usually) built on top of GCCcore. These can be used as a stand-alone compiler, but are also used as dependencies by various packages, such as pocl, TRIQS, and Longshot, and could be used by additional packages such as Score-P and Doxygen. This is because Clang also provides libraries for source-code parsing and processing. The Clang packages build their own copy of LLVM, and include other LLVM projects such as an OpenMP runtime library, the lld linker, the libc++ C++ Standard Library, and the polly polyhedral optimizer, though not all of those components are used by default with the current configuration.

There has been some work on packaging "legacy flang" (see https://github.com/easybuilders/easybuild-easyconfigs/pull/8335 and https://github.com/easybuilders/easybuild-easyblocks/pull/1729). However, the question is whether it is worth putting more effort into this, since things might change considerably with the "new flang".

Possible ways to organize things in EasyBuild

  1. Build full Clang (including lld, libraries, etc.) using an existing LLVM built with GCCcore as dependency

    • Pros:
      • Reduces redundancy
    • Cons:

      • Building LLVM projects out-of-tree is basically undocumented. Therefore, it is unclear how the projects relate to each other and how to configure things correctly. However, some information could be extracted from the Fedora RPM specs (e.g., for Clang).
      • The LLVM OpenMP library by default installs symlinks for libgomp and libiomp5, i.e., the OpenMP runtimes of the GCC and Intel compilers, as it implements both APIs. Thus, the order in which modules are loaded determines which runtime is found by ld.so and affects the runtime behavior of codes using OpenMP.

      Creating these symlinks can be disabled via a CMake configuration option, but doing so may lead to simultaneously using two different OpenMP runtimes if some OpenMP code compiled with Clang is linked to a library built with GCCcore also using OpenMP.

      • Likewise, enabling libc++ by default for Clang is likely to make code incompatible with C++ libraries compiled with GCCcore using libstdc++.
    • Caveats:
      • According to the polly documentation, it should probably be built as part of LLVM rather than Clang.
  2. Introduce a new package named, e.g., LLVM-Clang built with GCCcore providing a full Clang (including lld, libraries, etc.) and use it as a dependency for all packages that currently depend on either LLVM or Clang. A Clang compiler package would then be a bundle of GCCcore, LLVM-Clang, and binutils.

    • Pros:
      • Reduces redundancy
      • Follows the documented way of building all LLVM projects in one go
    • Cons:
      • Inherits the OpenMP runtime and libc++ issues outlined above
      • The LLVM-Clang vs. Clang packaging would probably cause questions similar to the GCCcore vs. GCC separation.
  3. Build minimal Clang (excluding lld, libraries, OpenMP runtime) on top of GCCcore -- either using an existing LLVM or as part of a LLVM-Clang package as outlined above -- to provide the Clang libraries to packages that need it as a dependency. In addition, build a full LLVM/Clang (including everything) on the SYSTEM level as a separate toolchain.

    • Pros:
      • Clear separation
      • Follows the documented way of building all LLVM projects in one go
      • The full toolchain aspects to consider are documented
    • Cons:
      • Requires duplication of everything built with GCCcore, as it is a completely separate toolchain.
    • Caveats:
      • Building Clang usually requires a "modern" host compiler (I found Clang >=3.5 or GCC >=5.1 documented as a requirement), i.e., on "Enterprise" Linux distros shipping ancient compilers one first needs to build another compiler for bootstrapping Clang on the SYSTEM level. (How does one properly do this? Use GCC as a builddep rather than toolchain???)
      • It is unclear whether gfortran could (temporarily) serve as a Fortran compiler in a full LLVM toolchain using, e.g., compiler-rt instead of libgcc_s. It's very likely that this won't work.
      • The Clang module under GCCcore serves a very limited purpose and should thus be avoided by end-users, unless they really know what they are doing. Not sure how to best prevent/document this. It is also unclear whether such a stripped down Clang would be sufficient for all packages that currently depend on the existing Clang packages.
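
The load-order effect described for the OpenMP symlinks in option 1 can be illustrated with a small simulation. This is only a sketch under stated assumptions: the install prefixes and the file lists are hypothetical, and the function only models a left-to-right walk over `LD_LIBRARY_PATH`, ignoring RPATH/RUNPATH entries and the loader cache.

```python
from typing import Dict, List, Optional, Set


def resolve_library(soname: str,
                    search_path: List[str],
                    contents: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first directory on search_path that provides soname.

    Simplified model of a left-to-right LD_LIBRARY_PATH lookup; RPATH,
    RUNPATH and the loader cache are deliberately ignored.
    """
    for directory in search_path:
        if soname in contents.get(directory, set()):
            return directory
    return None


# Hypothetical install prefixes; real EasyBuild paths will differ.
GCC_LIB = "/apps/GCCcore/lib64"
LLVM_LIB = "/apps/LLVM/lib"

# Illustrative file lists (assumption): the LLVM OpenMP build is taken to
# install libomp plus alias symlinks for the GCC and Intel runtimes.
contents = {
    GCC_LIB: {"libgomp.so.1", "libstdc++.so.6"},
    LLVM_LIB: {"libomp.so", "libgomp.so.1", "libiomp5.so"},
}

# Loading a module prepends its directory, so the module loaded last wins.
gcc_then_llvm = [LLVM_LIB, GCC_LIB]   # GCCcore loaded first, LLVM second
llvm_then_gcc = [GCC_LIB, LLVM_LIB]   # LLVM loaded first, GCCcore second

print(resolve_library("libgomp.so.1", gcc_then_llvm, contents))  # /apps/LLVM/lib
print(resolve_library("libgomp.so.1", llvm_then_gcc, contents))  # /apps/GCCcore/lib64
```

The same binary thus ends up with a different OpenMP runtime depending purely on module load order, which is the behavior the symlink discussion is about.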
ocaisa commented 3 months ago

@geimer I'm looking into this again now from the position that flang-new is at least good enough to get started on.

Looking at your first point, how about we do everything except the front end compilers Clang and Flang in an LLVM build, and then couple these two into a new LLVM-compilers as a toolchain? I would hope that building the frontends based on an existing installation should be doable, looking at their docs:

cmake -DLLVM_ENABLE_PROJECTS=clang -DCMAKE_BUILD_TYPE=Release -G "Unix Makefiles" ../llvm
make
Note: For subsequent Clang development, you can just run make clang.

so I wonder if we might get away with just not running the make step and jumping straight to make clang.

As regards OpenMP, we could disable the creation of the symlinks in the LLVM build, and create the appropriate symlink with the LLVM-compilers build (which I guess would be just the libgomp one). For libstdc++, we will likely have to make this the default...not sure what the performance implications of that would be.

Now that I see all that written down, I wonder if it doesn't just look like your second solution, just with LLVM and LLVM-compilers instead of LLVM-Clang and Clang. Indeed, is it really a problem that LLVM ships the compilers? They only have meaning for us when used in a toolchain. We can use LLVM-compilers to restore the OpenMP symlink based on whether we want the Intel or GCC versions.

ocaisa commented 3 months ago

Hmm, the issue with OpenMP and GCCcore is actually a general one. We can probably side-step that whole thing by making libgomp a banned library when using GCCcore (https://github.com/easybuilders/easybuild-framework/issues/4535).

ocaisa commented 3 months ago

Initial indications are that we should be able to separate out clang/flang, as @Crivella has found that their CMake files indicate support for this:

# If we are not building as a part of LLVM, build Clang as an
# standalone project, using LLVM as an external library:
if(CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR)
  project(Clang)
  set(CLANG_BUILT_STANDALONE TRUE)
endif()
# Check for a standalone build and configure as appropriate from
# there.
if (CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR)
  message("Building Flang as a standalone project.")
  project(Flang)
  set(FLANG_STANDALONE_BUILD ON)
else()
  set(FLANG_STANDALONE_BUILD OFF)
endif()
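
Both checks above fire when CMake is pointed directly at the frontend's own source directory instead of at `llvm/`. A rough sketch of assembling such a configure command follows; it is an untested assumption rather than a verified recipe. `LLVM_DIR` is the usual `find_package` hint for an installed LLVM, the helper name and all paths are made up for illustration, and a real standalone build (Flang in particular) will likely need further variables.

```python
from pathlib import Path
from typing import List


def standalone_configure_cmd(project: str,
                             src_root: Path,
                             llvm_prefix: Path,
                             build_dir: Path) -> List[str]:
    """Build a CMake configure command for a standalone frontend build.

    Using <src_root>/<project> as the top-level source directory makes
    CMAKE_SOURCE_DIR equal CMAKE_CURRENT_SOURCE_DIR in that project's
    CMakeLists.txt, which is what the standalone checks test for.
    """
    source_dir = src_root / project
    return [
        "cmake",
        "-S", str(source_dir),
        "-B", str(build_dir),
        "-DCMAKE_BUILD_TYPE=Release",
        # Assumption: the existing LLVM install ships its CMake package
        # files under <prefix>/lib/cmake/llvm.
        f"-DLLVM_DIR={llvm_prefix / 'lib' / 'cmake' / 'llvm'}",
    ]


# Hypothetical locations, for illustration only.
cmd = standalone_configure_cmd(
    "clang",
    Path("/src/llvm-project"),
    Path("/apps/LLVM"),
    Path("/tmp/build-clang"),
)
print(" ".join(cmd))
```

Returning an argument list rather than a shell string keeps the command easy to inspect or hand to `subprocess.run` later.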
Thyre commented 2 months ago

@geimer I'm looking into this again now from the position that flang-new is at least good enough to get started on.

I agree that the Fortran compiler is slowly approaching a usable state. With LLVM 18, there are still very simple OpenMP tests which fail to compile, but those are fixed in the current trunk version. This means that LLVM 19 might be a version where one can realistically try to build a toolchain.


Looking at your first point, how about we do everything except the front end compilers Clang and Flang in an LLVM build, and then couple these two into a new LLVM-compilers as a toolchain? I would hope that building the frontends based on an existing installation should be doable, looking at their docs:

Just to understand correctly, the idea would be to build OpenMP and so on with GCC first and afterwards build the Clang & Flang with this LLVM? We could do this, but should only build the minimal set of things we need for this to work or else users may miss features. The most important one I can think of right now would be support for OpenMP offloading, which may not work if OpenMP is built with GCC.

See (source):

Note: The compiler that generates the offload code should be the same (version) as the compiler that builds the OpenMP device runtimes. The OpenMP host runtime can be built by a different compiler.


As regards OpenMP, we could disable the creation of the symlinks in the LLVM build, and create the appropriate symlink with the LLVM-compilers build (which I guess would be just the libgomp one). For libstdc++, we will likely have to make this the default...not sure what the performance implications of that would be.

We would need to take not only the C++ standard library into consideration but also libc. LLVM provides its own implementations of both (libc++ and libc) in the LLVM repo, and libc might become more important in the future with continued work on the offloading infrastructure. As for the symlink, I would advocate against only offering libgomp, as this may leave out the LLVM OpenMP runtime, which means no offloading and no OpenMP Tools Interface. The latter would be an issue for performance tools like Score-P. We need to make sure that -fopenmp/-fopenmp=libomp work correctly.

See this example with Clang/trunk:

```console
$ clang --version
clang version 19.0.0git (https://github.com/llvm/llvm-project.git 0f323dc0c43bd45147bdf8ee9cbeef0d8f57165b)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/apps/software/Clang/trunk/bin
Build config: +assertions
$ clang -fopenmp=libgomp ompt_tool.c test.c -I$(pwd)/..
ompt_tool.c:196:14: warning: enumeration values 'ompt_dependence_type_out_all_memory' and 'ompt_dependence_type_inout_all_memory' not handled in switch [-Wswitch]
  196 |     switch ( t )
      |              ^
ompt_tool.c:1014:14: warning: enumeration value 'ompt_work_loop_static' not handled in switch [-Wswitch]
 1014 |     switch ( t )
      |              ^
ompt_tool.c:1168:35: warning: format specifies type 'int' but the argument has type 'uint64_t' (aka 'unsigned long') [-Wformat]
 1165 |         OMPT_TOOL_GUARDED_PRINTF( "[%s] tid = %" PRId32 "; WARNING: thread_begin_cb not dispatched; thread_data->value = %" PRId32 " (supposed to be >= 1)\n",
      |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1166 |                                   __FUNCTION__,
      |                                   ~~~~~~~~~~~~~
 1167 |                                   ompt_tool_tid,
      |                                   ~~~~~~~~~~~~~~
 1168 |                                   thread_data->value );
      |                                   ^~~~~~~~~~~~~~~~~~~~
./ompt_tool.h:81:13: note: expanded from macro 'OMPT_TOOL_GUARDED_PRINTF'
   81 |     printf( __VA_ARGS__ ); \
      |             ^~~~~~~~~~~
ompt_tool.c:1176:35: warning: format specifies type 'int' but the argument has type 'uint64_t' (aka 'unsigned long') [-Wformat]
 1172 |         OMPT_TOOL_GUARDED_PRINTF( "[%s] tid = %" PRId32 "; WARNING tid != thread_data->value (%" PRId32 " != %" PRId32 ")\n",
      |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1173 |                                   __FUNCTION__,
      |                                   ~~~~~~~~~~~~~
 1174 |                                   ompt_tool_tid,
      |                                   ~~~~~~~~~~~~~~
 1175 |                                   ompt_tool_tid,
      |                                   ~~~~~~~~~~~~~~
 1176 |                                   thread_data->value );
      |                                   ^~~~~~~~~~~~~~~~~~~~
./ompt_tool.h:81:13: note: expanded from macro 'OMPT_TOOL_GUARDED_PRINTF'
   81 |     printf( __VA_ARGS__ ); \
      |             ^~~~~~~~~~~
4 warnings generated.
$ ./a.out
$ clang -fopenmp=libomp ompt_tool.c test.c -I$(pwd)/..
ompt_tool.c:196:14: warning: enumeration values 'ompt_dependence_type_out_all_memory' and 'ompt_dependence_type_inout_all_memory' not handled in switch [-Wswitch]
  196 |     switch ( t )
      |              ^
ompt_tool.c:1014:14: warning: enumeration value 'ompt_work_loop_static' not handled in switch [-Wswitch]
 1014 |     switch ( t )
      |              ^
ompt_tool.c:1168:35: warning: format specifies type 'int' but the argument has type 'uint64_t' (aka 'unsigned long') [-Wformat]
 1165 |         OMPT_TOOL_GUARDED_PRINTF( "[%s] tid = %" PRId32 "; WARNING: thread_begin_cb not dispatched; thread_data->value = %" PRId32 " (supposed to be >= 1)\n",
      |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1166 |                                   __FUNCTION__,
      |                                   ~~~~~~~~~~~~~
 1167 |                                   ompt_tool_tid,
      |                                   ~~~~~~~~~~~~~~
 1168 |                                   thread_data->value );
      |                                   ^~~~~~~~~~~~~~~~~~~~
./ompt_tool.h:81:13: note: expanded from macro 'OMPT_TOOL_GUARDED_PRINTF'
   81 |     printf( __VA_ARGS__ ); \
      |             ^~~~~~~~~~~
ompt_tool.c:1176:35: warning: format specifies type 'int' but the argument has type 'uint64_t' (aka 'unsigned long') [-Wformat]
 1172 |         OMPT_TOOL_GUARDED_PRINTF( "[%s] tid = %" PRId32 "; WARNING tid != thread_data->value (%" PRId32 " != %" PRId32 ")\n",
      |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1173 |                                   __FUNCTION__,
      |                                   ~~~~~~~~~~~~~
 1174 |                                   ompt_tool_tid,
      |                                   ~~~~~~~~~~~~~~
 1175 |                                   ompt_tool_tid,
      |                                   ~~~~~~~~~~~~~~
 1176 |                                   thread_data->value );
      |                                   ^~~~~~~~~~~~~~~~~~~~
./ompt_tool.h:81:13: note: expanded from macro 'OMPT_TOOL_GUARDED_PRINTF'
   81 |     printf( __VA_ARGS__ ); \
      |             ^~~~~~~~~~~
4 warnings generated.
$ OMP_NUM_THREADS=1 ./a.out
[ompt_start_tool] tid = -1 | omp_version 201611 | runtime_version = 'LLVM OMP version: 5.0.20140926'
[my_initialize_tool] tid = -1 | initial_device_num 0
[register_callbacks] tid = -1 | thread_begin = always (expect always)
[register_callbacks] tid = -1 | thread_end = always (expect always)
[register_callbacks] tid = -1 | parallel_begin = always (expect always)
[register_callbacks] tid = -1 | parallel_end = always (expect always)
[register_callbacks] tid = -1 | task_create = always (expect always)
[register_callbacks] tid = -1 | task_schedule = always (expect always)
[register_callbacks] tid = -1 | implicit_task = always (expect always)
[register_callbacks] tid = -1 | target = always (expect always)
[register_callbacks] tid = -1 | target_emi = always (expect always)
[register_callbacks] tid = -1 | target_data_op = always (expect always)
[register_callbacks] tid = -1 | target_data_op_emi = always (expect always)
[register_callbacks] tid = -1 | target_submit = always (expect always)
[register_callbacks] tid = -1 | target_submit_emi = always (expect always)
[register_callbacks] tid = -1 | control_tool = always (expect always)
[register_callbacks] tid = -1 | device_initialize = always (expect always)
[register_callbacks] tid = -1 | device_finalize = always (expect always)
[register_callbacks] tid = -1 | device_load = always (expect always)
[register_callbacks] tid = -1 | device_unload = never (expect always)
[register_callbacks] tid = -1 | sync_region_wait = always
[register_callbacks] tid = -1 | mutex_released = always
[register_callbacks] tid = -1 | dependences = always
[register_callbacks] tid = -1 | task_dependence = always
[register_callbacks] tid = -1 | work = always
[register_callbacks] tid = -1 | masked = always
[register_callbacks] tid = -1 | target_map = never
[register_callbacks] tid = -1 | target_map_emi = never
[register_callbacks] tid = -1 | sync_region = always
[register_callbacks] tid = -1 | reduction = always
[register_callbacks] tid = -1 | lock_init = always
[register_callbacks] tid = -1 | lock_destroy = always
[register_callbacks] tid = -1 | mutex_acquire = always
[register_callbacks] tid = -1 | mutex_acquired = always
[register_callbacks] tid = -1 | nest_lock = always
[register_callbacks] tid = -1 | flush = always
[register_callbacks] tid = -1 | cancel = always
[register_callbacks] tid = -1 | dispatch = always
[register_callbacks] tid = -1 | error = always
[thread_begin_cb] tid = 1 | type = initial
[implicit_task_cb] tid = 1 | parallel_data = 0 | task_data = 6660001 | endpoint = begin | actual_parallelism = 1 | index = 1 | flags = initial
[parallel_begin_cb] tid = 1 | parallel_data = 7770001 | encountering_task_data = 6660001 | flags = invoker_runtime_team | requested_parallelism = 1 | codeptr_ra = 0x59fcfcd069bb
[implicit_task_cb] tid = 1 | parallel_data = 7770001 | task_data = 6660002 | endpoint = begin | actual_parallelism = 1 | index = 0 | flags = implicit
[implicit_task_cb] tid = 1 | parallel_data = 7777777 | task_data = 6660002 | endpoint = end | actual_parallelism = 1 | index = 0 | flags = implicit
[parallel_end_cb] tid = 1 | parallel_data = 7770001 | encountering_task_data = 6660001 | flags = invoker_runtime_team | codeptr_ra = 0x59fcfcd069bb
[implicit_task_cb] tid = 1 | parallel_data = 0 | task_data = 6660001 | endpoint = end | actual_parallelism = 0 | index = 1 | flags = initial
[thread_end_cb] tid = 1
[my_finalize_tool] tid = 1
```
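
A quick way to confirm which OpenMP runtime(s) a binary pulls in is to scan the output of `ldd`. The helper below is a minimal sketch of that idea, not an existing tool: the function name is made up, and the sample text is fabricated purely to show the expected input shape.

```python
import re
from typing import Set

# Library base names treated as OpenMP runtimes (GCC, LLVM, Intel).
OPENMP_RUNTIMES = ("libgomp", "libomp", "libiomp5")


def openmp_runtimes(ldd_output: str) -> Set[str]:
    """Return the OpenMP runtime names that appear in ldd-style output."""
    found = set()
    for line in ldd_output.splitlines():
        # Capture the library base name, e.g. "libomp" from "libomp.so => ...".
        match = re.match(r"\s*(lib[\w+-]+)\.so", line)
        if match and match.group(1) in OPENMP_RUNTIMES:
            found.add(match.group(1))
    return found


# Fabricated sample, for illustration only (not taken from a real system).
SAMPLE = """\
    linux-vdso.so.1 (0x0000000000000001)
    libomp.so => /apps/LLVM/lib/libomp.so (0x0000000000000002)
    libc.so.6 => /lib64/libc.so.6 (0x0000000000000003)
    /lib64/ld-linux-x86-64.so.2 (0x0000000000000004)
"""

print(openmp_runtimes(SAMPLE))  # {'libomp'}
```

If the returned set has more than one entry, the binary (or one of its libraries) mixes OpenMP runtimes, which is the situation to avoid.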
hattom commented 2 months ago

Maybe it would be not unhelpful if I provide an atypical(?) perspective for why I'm interested in a[1] Flang/Clang toolchain, in fact why I'm interested even if the compiler is not bug-free.

Since I'm developing and maintaining a Fortran application, and one which also offloads to GPU, I want to be able to test the compilers (also without offloading), and find out if I need to i) work around compiler issues, or ii) ask a vendor to prioritize a feature for us. At the moment, it's kind of tricky to get an environment in which I can test the code, so I can't tell e.g. AMD if I can even get ready to start trying to port the offloading (e.g. if the compiler can compile our code on CPU). This environment for testing is what interests me in a Flang/Clang toolchain. For this, I need at least Compiler, MPI, HDF5, LAPACK. I think everything else I could deal with later.

[1] actually, I'd like to have ~3 toolchains, but that's maybe getting a bit far ahead of ourselves: upstream flang-new, upstream flang-classic, and vendored (e.g.) amd-flang

Thyre commented 2 months ago

Maybe it would be not unhelpful if I provide an atypical(?) perspective for why I'm interested in a[1] Flang/Clang toolchain, in fact why I'm interested even if the compiler is not bug-free.

I'm absolutely with you on this one. The Fortran compiler is getting mature enough that building a toolchain is feasible and, even if not entirely bug-free, might be of great interest to users.

Since I'm developing and maintaining a Fortran application, and one which also offloads to GPU, I want to be able to test the compilers (also without offloading), and find out if I need to i) work around compiler issues, or ii) ask a vendor to prioritize a feature for us.

While I agree that this is a very interesting scenario, having an up-to-date latest and greatest version available all the time might be difficult, especially if a whole toolchain is built with that compiler. That's one reason why I chose to build a very small toolchain manually (basically only LLVM/Clang + OpenMPI), even if it's more painful to do so. This allows me to test a daily Clang (and sometimes also AOMP) build for issues.

[1] actually, I'd like to have ~3 toolchains, but that's maybe getting a bit far ahead of ourselves:

upstream flang-new, upstream flang-classic, and vendored (e.g.) amd-flang

I would guess that flang-classic will fade out once flang-new is ready. Building amdflang is more complicated though, as you would probably also want to have their entire LLVM toolchain. At that point, you're basically building AOMP (or the equivalent ROCm components) from source. That's possible, but as you said, we should focus on LLVM/Clang first.

ocaisa commented 2 months ago

I'm absolutely with you on this one. The Fortran compiler is getting mature enough that building a toolchain is feasible and, even if not entirely bug-free, might be of great interest to users.

I think this is the general idea here: we move ahead in the C/C++ space and see where things break with Fortran. As things improve, we will get more and more Fortran applications building. I'm particularly interested in pushing some Fortran applications we are connected to to start working on OpenMP device offloading.

While I agree that this is a very interesting scenario, having an up-to-date latest and greatest version available all the time might be difficult, especially if a whole toolchain is built with that compiler. That's one reason why I chose to build a very small toolchain manually (basically only LLVM/Clang + OpenMPI), even if it's more painful to do so. This allows me to test a daily Clang (and sometimes also AOMP) build for issues.

So, there is a subtle issue here. EasyBuild itself updates toolchains twice a year (e.g., 2023a, 2023b). That means that you can expect to find a reasonable amount of software for those releases and we would try harder to support them. That doesn't mean we couldn't have additional releases of Clang and an associated toolchain (these are usually versioned 2023.11, etc.), just that only 2 per year would become a "supported" toolchain. How you can work with this depends on your use case. If you are not bumping your dependency versions (outside of an EB toolchain release), then --try-toolchain is your friend. Of course, the benefit of working together is that we can continuously improve the relevant easyblock to ensure we are always current with our support and making good choices. This won't cover daily builds, but at least the easyblock will work out of the box.
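
As a rough illustration of the --try-toolchain route, the sketch below assembles an `eb` command line that retargets an existing easyconfig to another toolchain. The easyconfig file name, the toolchain name `llvm-compilers` and its version are placeholders, not existing EasyBuild entities; only the `--try-toolchain=NAME,VERSION` and `--robot` options are taken to exist as in current EasyBuild.

```python
from typing import List


def try_toolchain_cmd(easyconfig: str, tc_name: str, tc_version: str) -> List[str]:
    """Assemble an 'eb' command that retargets an easyconfig to another toolchain."""
    # The option value is a single comma-separated NAME,VERSION pair,
    # so neither part may itself contain a comma.
    if "," in tc_name or "," in tc_version:
        raise ValueError("toolchain name and version must not contain commas")
    return [
        "eb",
        easyconfig,
        f"--try-toolchain={tc_name},{tc_version}",
        "--robot",  # let EasyBuild resolve and build missing dependencies
    ]


# Placeholder names, for illustration only.
cmd = try_toolchain_cmd("HDF5-1.14.0-gompi-2023a.eb", "llvm-compilers", "2024.06")
print(" ".join(cmd))
```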

[1] actually, I'd like to have ~3 toolchains, but that's maybe getting a bit far ahead of ourselves: upstream flang-new, upstream flang-classic, and vendored (e.g.) amd-flang

I would guess that flang-classic will fade out once flang-new is ready. Building amdflang is more complicated though, as you would probably also want to have their entire LLVM toolchain. At that point, you're basically building AOMP (or the equivalent ROCm components) from source. That's possible, but as you said, we should focus on LLVM/Clang first.

I don't think we would be considering flang-classic; I'd prefer to look forward given the offloading support. AOMP is going to happen: there is a work-in-progress PR, and we need this anyway for ROCm.

Crivella commented 2 months ago

For this, I need at least Compiler, MPI, HDF5, LAPACK. I think everything else I could deal with later.

I've been working on this (still making sure that all components of llvm-project also work properly with a bootstrapped build to remove GCC dependencies). With a simple build of flang-new I did try to compile QuantumESPRESSO, but it failed on FFTXlib due to a lack of support for polymorphism.

Will have to try with HDF5 and LAPACK

Thyre commented 2 months ago

So, there is a subtle issue here. EasyBuild itself updates toolchains twice a year (e.g., 2023a, 2023b). That means that you can expect to find a reasonable amount of software for those releases and we would try harder to support them. That doesn't mean we couldn't have additional releases of Clang and an associated toolchain (these are usually versioned 2023.11, etc.), just that only 2 per year would become a "supported" toolchain. How you can work with this depends on your use case. If you are not bumping your dependency versions (outside of an EB toolchain release), then --try-toolchain is your friend. Of course, the benefit of working together is that we can continuously improve the relevant easyblock to ensure we are always current with our support and making good choices. This won't cover daily builds, but at least the easyblock will work out of the box.

Looking at the current EasyConfigs, there seems to be one version per toolchain update (with Clang 16 being the exception), normally using the last release in the LLVM/Clang update cycle. From my perspective, this is sufficient for EasyBuild and its users.

Like you've said, additional versions can easily be added and if a major version is supported already (e.g. 18.1.0), the chance is high that another version (e.g. 18.1.7) also works fine when passed via --try-toolchain.

I've been working on this (still making sure that all components of llvm-project also work properly with a bootstrapped build to remove GCC dependencies).

That sounds great! There are certainly some quirks that can come up. Just as an example: We're developing an LLVM IR plug-in for our application that will be used as an additional pass when a user compiles his application. There, we want to use llvm::demangle. However, the Clang installation on our HPC system fails to link llvm::demangle even though LLVMDemangle.a is linked. One needs to link libclang.so, which is not provided by llvm-config.

Crivella commented 2 months ago

To have a point of reference, I've just opened 2 PRs for EB and related EC files.

There are still some things that need fixing/improving, but in the meantime suggestions/comments are welcome.

ocaisa commented 2 months ago

We should see LLVM 19 in September: https://discourse.llvm.org/t/llvm-19-release-schedule-and-planning/79828