Closed ian-ross closed 9 months ago
Thanks for this @ian-ross ! I'm giving it a shot now. Do you know exactly what the minimum dependencies are? I'm having trouble somewhere during the boost build on a Linux cluster with GNU compiler suite 8.5.0.
EDIT: To be clear I do have cmake, curl, zip, unzip, tar, and pkg-config installed, but I'm getting a barrage of errors like this:
cc1plus: warning: /rds/general/user/seastham/home/simulations/C2024-01-06_APCEMMTest/Code.APCEMM_vcpkg/Code.v05-00/submodules/vcpkg/buildtrees/boost-math/x64-linux-dbg/boost/build/447e539c72b8d8a1c8c92bdc86e12da7/pch.gch: had text segment at different address
cc1plus: error: one or more PCH files were found, but they were invalid
I don't know what the minimum GCC version is, but it builds on Hex with g++ 9.4.0.
Is this as part of installing Boost, or the actual APCEMM build?
I think you need to figure out how to disable the use of precompiled headers -- the error looks suspiciously like the one reported here: https://bugzilla.redhat.com/show_bug.cgi?id=1806545
One thing I learnt this morning is that setting CMAKE_CXX_STANDARD doesn't actually make CMake require that standard, which is a bit confusing. I think there should be an additional line in the top-level CMakeLists.txt:
set(CMAKE_CXX_STANDARD_REQUIRED ON)
But that will cause CMake to fail with old compilers...
And I think that error might be from when vcpkg is installing the Boost libraries. But I've never tried to build this with an older compiler.
OK - I've at least tried with a newer version of gcc (10.2.0) but still no joy, so it's at least not that. The final errors are
CMake Error: CMake was unable to find a build program corresponding to "Unix Makefiles". CMAKE_MAKE_PROGRAM is not set. You probably need to select a different build tool.
CMake Error: CMAKE_C_COMPILER not set, after EnableLanguage
CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
but I'm not certain that this is really the fundamental issue? With regards to precompiled headers I did delete ~/.vcpkg/* before this.
I am so far from being an expert with CMake... I have seen that error before, but I can't remember what was causing it. First stupid questions: Do you have a working version of make? What does make --version say? And do you have a working version of g++? What does g++ --version say? (Just to rule out any silly "installed but can't find it" problems.) And finally, what version of CMake do you have? What does cmake --version say?
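The three questions above can be answered in one go. This is only a minimal sketch: the function name report_tool is made up for illustration, and it assumes each tool accepts --version, which make, g++ and cmake all do.

```shell
#!/bin/sh
# Report whether a tool is on PATH and, if so, where it lives and the
# first line of its --version output. Returns non-zero if it is missing.
# (report_tool is a hypothetical helper name, not part of any project.)
report_tool() {
    tool=$1
    if path=$(command -v "$tool" 2>/dev/null); then
        version=$("$tool" --version 2>/dev/null | head -n 1)
        printf '%s: %s (%s)\n' "$tool" "$path" "$version"
    else
        printf '%s: NOT FOUND on PATH\n' "$tool"
        return 1
    fi
}

# Check the three tools asked about; keep going even if one is missing.
for tool in make g++ cmake; do
    report_tool "$tool" || true
done
```

Printing the resolved path as well as the version helps with the "installed but can't find it" case, for example when a module loader puts a different g++ ahead of the system one.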
Also, if you want to zip up the whole of the build output and put it somewhere I can download it, I'll try to figure out what's going on.
Looks like make, cmake, and g++ are functioning (fully agreed these were reasonable questions though - I've been caught out by much simpler things before!). Nonetheless, the versions are:
make 4.2.1
g++ 10.2.0
cmake 3.20.2
Unfortunately the build directory is too big for GitHub, but for now you can get a zipped tarball (57 MB) here: https://www.dropbox.com/scl/fi/6b0tsxz688xtwax0mp740/build.tar.gz?rlkey=qtle0h2q984q234pd3a3qz0y4&dl=0
I'll take a look!
It's failing to build OpenSSL, which is needed for one of the Boost packages. What sort of Linux are you on? Can you show me what's in /etc/issue? Then I can set up a VM with that exact Linux version to figure out what's wrong. I'm guessing it's Ubuntu rather than Debian, because one of the things that seems to be missing is the Linux kernel headers, and I think Debian installs those by default. The Ubuntu package you need for that is linux-libc-dev apparently. But I don't know why it worked fine in an Ubuntu VM I set up if that's the case. That was Ubuntu 22.04 which is pretty recent though. The other thing is some random Perl module. You might need to install perl-core or perl5-core, but I'm not sure about that.
In any case, tell me the Linux distribution you're using and I'll see if I can reproduce the problem in a VM. I think that's easier than me trying to tell you to install random operating system packages until things work.
(I have to say that I'm not too impressed with the "reproducible build" part of CMake and vcpkg here. Maybe I've just not set it up right.)
/etc/issue was unfortunately a bit unhelpful but here is the content of /etc/os-release:
NAME="Red Hat Enterprise Linux"
VERSION="8.5 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.5"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.5 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8::baseos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/red_hat_enterprise_linux/8/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.5
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.5"
So - neither Ubuntu nor Debian I'm afraid! I do have access to some versions of perl through the module loader, at least.
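For anyone else reporting their platform, the two fields that matter most can be pulled out of /etc/os-release directly, since the file is written as shell variable assignments. A minimal sketch, with distro_id being a made-up helper name:

```shell
#!/bin/sh
# Print "ID VERSION_ID" from an os-release style file.
# The file is sourced inside a subshell so that its variables do not
# leak into the calling shell. (distro_id is a hypothetical helper.)
distro_id() {
    file=${1:-/etc/os-release}
    [ -r "$file" ] || return 1
    (
        unset ID VERSION_ID
        # The path must contain a slash, otherwise "." searches PATH.
        . "$file"
        printf '%s %s\n' "${ID:-unknown}" "${VERSION_ID:-unknown}"
    )
}

distro_id || echo "no readable /etc/os-release"
```

For the file shown above this would print "rhel 8.5".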
Oh lord. I'll see what I can do.
Thanks @ian-ross ! Given that this IS Red Hat though, I'm wondering if the issue raised by @speth (https://bugzilla.redhat.com/show_bug.cgi?id=1806545) may indeed be at the root of all this given the complaints about PCHs?
I'm curious which package is dragging in curl as a dependency -- that's the start of this chain of problems with trying to build OpenSSL and struggling with perl modules and needing kernel headers. Knowing nothing about vcpkg, is there any way to tell it to just use an existing install of curl (which I assume is already present on the system)?
I think it's boost-build that uses OpenSSL. I don't know if that's even needed (maybe if it's used for building other Boost packages?). It might be possible to restrict the Boost packages that are installed to just those that are needed. I'm waiting for the RHEL 8.5 installation media to download now so I can set up a VM to figure out what's going on.
This is turning out to be less than enjoyable. I'll give it a little more time tomorrow, but I think I might have to say that you're on your own on Red Hat. I've set up a VM, but you can't install software without registering with Red Hat, and the registration process wasn't working at all: I'd set up an account with Red Hat, I'd set up a developer subscription, but when I tried to log in to the account from Red Hat subscription manager, it just said "Name or service unknown". After messing around for ages with trying loads of stupid things, I thought to check the network connection... The Red Hat installer doesn't enable the network, unlike every other Linux installer, so nothing works to start with. I've now got it set up so I can install software, so I'll see what I can do tomorrow.
I'm curious which package is dragging in curl as a dependency -- that's the start of this chain of problems with trying to build OpenSSL and struggling with perl modules and needing kernel headers. Knowing nothing about vcpkg, is there any way to tell it to just use an existing install of curl (which I assume is already present on the system)?
It's the NetCDF libraries. It ought to be possible to make them use the system libraries instead of building their own, but I failed to get that to work.
After spending all morning on this, I came up with the following recipe. This works in a fresh RHEL 8.5 virtual machine, but I can't make any guarantees about what happens on your cluster.
# Install some prerequisites: I don't know if all of these are needed, but
# they won't hurt.
sudo yum install git tar bzip2 kernel-headers perl gcc make
# Install and enable GCC 10 toolchain.
sudo yum install gcc-toolset-10
# This is needed in any shell where you want to use g++ 10. It sets up the
# paths to use the right compilers.
source scl_source enable gcc-toolset-10
# Install other prerequisites (all the others seem to already be installed).
sudo yum install cmake python3.11
# Make python3 mean Python 3.11.
mkdir -p ~/.local/bin
ln -s $(which python3.11) ~/.local/bin/python3
# Clone the repo and check out the vcpkg branch.
git clone https://github.com/ian-ross/APCEMM.git
cd APCEMM
git checkout vcpkg-dependencies
git submodule update --init --recursive
# Build.
mkdir build
cd build
cmake ../Code.v05-00
cmake --build .
The curl stuff was a bit of a red herring, I think. The main problem is that one of the dependencies used by NetCDF is a package configuration tool that uses Meson as its build system. And Meson relies on Python >= 3.7. RHEL 8.5 has Python 3.6 (which went end-of-life in 2021, so is no longer supported by a lot of tools) so it didn't work. The solution is to install Python 3.11 and add a link in ~/.local/bin (which should be in your path ahead of /usr/bin) so that python3 means Python 3.11.
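Before kicking off a long vcpkg build it may be worth checking which python3 the shell actually resolves to. A minimal sketch, assuming a sort that supports -V (GNU coreutils does); version_ge and python3_version are made-up helper names, and the 3.7 threshold is the Meson requirement mentioned above:

```shell
#!/bin/sh
# Succeed if version $1 is greater than or equal to version $2.
# Relies on "sort -V" (version sort) putting the smaller version first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Ask whichever python3 comes first on PATH for its major.minor version.
python3_version() {
    python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])'
}

if command -v python3 >/dev/null 2>&1; then
    found=$(python3_version)
    if version_ge "$found" 3.7; then
        echo "python3 is $found: new enough for Meson"
    else
        echo "python3 is $found: older than 3.7, Meson will not run"
    fi
else
    echo "python3 not found on PATH"
fi
```

If this still reports 3.6 after creating the link, ~/.local/bin is probably not ahead of /usr/bin in PATH.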
Apart from that and making sure there's a good g++ version, I think that's it. I'm going to do one more test in a fresh VM, but I think this should work now.
However, it's really hard to support software for obsolete systems like this. "Obsolete" might sound a bit strong, but Python 3.6 has been in end-of-life since 2021! I think it would be useful to set a range of tool versions that APCEMM requires, otherwise every single time someone tries to install it they're going to go through this sort of pain. The vcpkg thing handles direct and transitive dependencies quite well up to a point, but if fundamental things like the C++ compiler or Python are very old, it can't help too much.
(You could also use Conda to manage the Python dependency, of course. That might be easier.)
OK, I just did a test in a freshly installed RHEL 8.5 virtual machine and the instructions above result in a successful build. I just hope they work on your cluster too!
OK - following the excellent work by @ian-ross I can confirm that APCEMM will build and run with RHEL 8.5 as long as a) you have more recent GNU compilers and b) you have a more recent version of Python3. I'm still finding some very weird behavior but I believe that's an unrelated bug which I will raise shortly - so I think this is good to go.
Good. Glad that worked. I was worried there was going to be something else weird on your cluster, @sdeastham !
Some context for Michael: Seb asked me to look at making it more convenient to build APCEMM on different platforms. This PR is a first attempt at doing this, using vcpkg to manage external dependencies. It works pretty well. I can build on a clean Ubuntu system, on my development machine, which has Manjaro Linux, and I can even get part of the way to building on Windows.
Build instructions
I have a fork of the APCEMM repo that uses vcpkg. First get the relevant fork and the relevant branch:
Then set up vcpkg, which is installed as a Git submodule. This means that users don't need to install it themselves, just run this command when they clone the repo:
Then do an out-of-tree build:
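The three steps above can be sketched as a single script. This is an assumption-laden sketch rather than the original commands: the clone URL, branch name and source directory are copied from the RHEL recipe earlier in this thread and may differ from what was originally posted here, and run and build_apcemm are made-up helper names. Setting DRY_RUN=1 prints the commands without executing them.

```shell
#!/bin/sh
# Echo a command, then execute it unless DRY_RUN=1 is set.
# (run and build_apcemm are hypothetical helper names.)
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# Clone the fork, fetch vcpkg as a submodule, and do an out-of-tree build.
# The body runs in a subshell, so the caller's working directory is left
# alone. URL, branch and source directory are assumptions taken from the
# RHEL recipe in this thread.
build_apcemm() (
    run git clone https://github.com/ian-ross/APCEMM.git &&
    run cd APCEMM &&
    run git checkout vcpkg-dependencies &&
    run git submodule update --init --recursive &&
    run mkdir build &&
    run cd build &&
    run cmake ../Code.v05-00 &&
    run cmake --build .
)
```

Running "DRY_RUN=1 build_apcemm" lists the eight commands; calling "build_apcemm" on its own executes them and stops at the first failure.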
Timing
On my dev machine (fast, lots of memory, Manjaro Linux rolling release):
The "git submodule update ..." command takes about 35s to complete.
The CMake run takes about 8m40s the first time you run it.
Subsequent executions of CMake in a fresh build directory take less than 10 seconds, because of vcpkg's binary caching.
Ubuntu
In a fresh Ubuntu 22.04 desktop VM, all the setup you need to do is:
This really is all of the OS setup you need to do. The only other thing you need to do is to copy some SSH keys into the VM so you can clone the repo. Then you can follow the build instructions at the top, and they will work.
The CMake command here took 15m23s on the first run. As before, subsequent runs take less than 10s.
Windows
This is all done in a Windows 11 developer evaluation VM. It has MS Visual Studio 2022 (v17).
~/.ssh. (At this point I usually do "ssh -T git@github.mit.edu" to make sure the keys are set up right.)
I didn't collect any timing information (it took a long time). Doing anything on Windows feels like the talking dog: you don't ask how well it does it, because the fact that it does it at all is a miracle. (It's possible that at some point CMake may have installed Python to build a dependency that used Meson for its build... I didn't really want to look at the details.)
In any case, the installation of all the dependencies worked fine, and I got to the point where I could run "cmake --build .". Unfortunately, this failed with a bunch of compilation errors. I think MSVC++ is fussy about some things that g++ doesn't care about. For example, virtual constexpr is only allowed in C++20, but g++ doesn't seem to care if you use it with the standard set to C++17. MSVC++ complains. That one was easy to fix by changing the language standard, but the next problems that came up were something to do with using OMP's threadprivate in a header file, which would take some more investigation.
Summary
So, partial success, and it probably wouldn't take too much more time to get APCEMM working on Windows. In the meantime, I believe that this PR should make it possible to build on more or less any Linux system without needing to install any external dependencies. (Even the Boost dependencies are managed via vcpkg.)