Increase required C++ version to C++20 and CUDA version to 12

jngrad commented 8 months ago

New features in C++20

designated initializers and 3-way comparison (spaceship operator <=>)

#include <compare>

struct S {
  int i;
  int j;
  float f;
  constexpr auto operator<=>(const S&) const = default;
};

void foo() {
  constexpr S s1{.i = 2, .j = 3, .f = 3.14f};
  constexpr S s2{.i = 2, .j = 3, .f = 3.14f * 2.f};
  static_assert(!(s1 == s2));
  static_assert( (s1 != s2));
  static_assert( (s1 <  s2));
  static_assert( (s1 <= s2));
  static_assert(!(s1 >  s2));
  static_assert(!(s1 >= s2));
}

constraints and concepts

#include <concepts>

auto add(std::integral auto lhs, std::integral auto rhs) {
  return lhs + rhs;
}

mathematical constants

string formatter (Python syntax)

#include <iostream>
#include <format>
#include <numbers>

int main() {
  std::cout << "Python formatting syntax: ";
  std::cout << std::format("π≈{:.4f}", std::numbers::pi_v<double>) << "\n";
}

Output:

Python formatting syntax: π≈3.1416

ranges

#include <ranges>
#include <vector>
#include <algorithm>
#include <iostream>

int main() {
  std::cout << "Print the square of even numbers from a sorted list:\n";
  std::vector<int> values{1, 4, 5, 9, 2};
  std::ranges::sort(values);
  for (auto const val : values | std::views::filter([](int i) { return i % 2 == 0; })
                               | std::views::transform([](int i) { return i * i; })) {
    std::cout << val << "\n";
  }
}

Output:

Print the square of even numbers from a sorted list:
4
16

std::span to abstract away std::vector and std::array in function arguments
constexpr std::vector and std::string in GCC 12+ and Clang15+ (only inside a constexpr function)
constexpr algorithms (sorting, finding, etc.)
std::has_single_bit()
std::unordered_map::contains()

Not fully supported by all compilers yet:

coroutines
modules
optional typenames
consteval functions

See compiler support tables for more details.

Applicability

We can replace all calls to std::to_string with corresponding calls to std::format. The standard function std::to_string is known to be ill-suited to represent floating-point numbers outside the range [1e-6, 1e6] due to the fixed decimal format it uses, among other issues outlined in draft proposal [D2587R1]. In fact, std::to_string will be redefined around std::format in C++26, but without addressing the underlying issue of precision loss in small numbers. ESPResSo calls std::to_string 70 times, approximately 20 of them with a floating-point value, because it is more convenient than alternatives based on std::stringstream or snprintf. Likewise, several std::stringstream objects are used to format error messages and could be replaced by a corresponding std::format call.

A lot of C++20 features were backported to ESPResSo in the Utils namespace about 6 years ago. These backports require extensive testing and, in a few rare cases, make it difficult to interface ESPResSo with third-party C++ libraries. We chose not to use equivalent features in Boost because some of these features were not just backported to C++14, but also ported to CUDA 9. These features are now available in the standard library and in CUDA 12. In particular, std::span can replace Utils::Span, mathematical constants can replace most of the values defined in utils/constants.hpp, std::unordered_map::contains() can replace a lot of boilerplate code involving iterators or element counting, and std::has_single_bit() can replace hard-to-read bit operations to detect whether thermo_switch and similar bitfields have only one bit set (std::bitset is not used for performance reasons).

Concepts can replace trivial template declarations to help generate more helpful compiler error messages. Designated initializers can be used to help disambiguate constructor calls involving multiple arguments sharing the same type.

Prior work

ESPResSo has been C++20-ready since b1f59e0da8f27d27ca00b5acd517e2ce961cdd2b and is tested in CI for builds without CUDA.

Requirements

We probably need to drop support for CUDA 11, which doesn't support C++20, or find a way to compile CUDA code in C++17 mode and C++ code in C++20 mode via CMake options. We already require a minimum of CUDA 11.3 when the compiler is Clang because recent GPUs with architectures sm_70+ require Thrust 1.11. For GCC the situation is a bit less clear: we currently require CUDA 11.0 but only test 11.5 in CI.

Bumping CUDA requirements to 12.0 would be the easiest solution. Ubuntu 24.04 ships CUDA 12.0 via nvidia-cuda-toolkit. Many compute clusters migrated to CUDA 12.0 in May 2023 to mitigate several CVEs (full list), namely: bwUniCluster 2.0, bwForCluster JUSTUS 2, bwForCluster Helix, HLRS Vulcan, HPC Vega, Jülich JUWELS.

We would have to drop support for older compilers and require GCC 10+ and Clang 14+ (or Clang 17+ to build CUDA code without nvcc). I double-checked with our EasyBuild partners and increasing the compiler version requirements wouldn't be an issue for them. Several packages in gompi/2023a have already migrated to CUDA 12.1.

Course of action

Update CMAKE_CXX_STANDARD to 20.
Drop support for CUDA 11, which doesn't support C++20.
Figure out which GCC and Clang toolchains are compatible with CUDA 12.

Timeline: probably best to defer this change until the Ubuntu 24.04 migration at the home institute. This way we don't need to sort out compiler toolchains twice.

jngrad commented 8 months ago

Compatible toolchains to build C++ and CUDA code in Ubuntu noble:

GCC 10.5.0 with NVCC 12.0
GCC 11.4.0 with NVCC 12.0
GCC 12.3.0 with NVCC 12.0
GCC 13.2.0 with NVCC 12.0
Clang 14.0.6 with NVCC 12.0
Clang 17.0.6 for both C++ and CUDA code

jngrad commented 8 months ago

When using Clang as the CUDA compiler, extra steps need to be taken when running the initial CMake configuration.

CMake cannot automatically detect default architectures:

CMake Error at /usr/share/cmake-3.27/Modules/CMakeDetermineCUDACompiler.cmake:603 (message):
  Failed to detect a default CUDA architecture.

  Compiler output:

Solution: add -D CMAKE_CUDA_ARCHITECTURES="61;75" to the CMake command to specify which architectures to build for.

CMake fails to execute enable_language(CUDA):

CMake Error at /usr/share/cmake-3.27/Modules/CMakeDetermineCompilerABI.cmake:57 (try_compile):
  Failed to generate test project build system.

Solution: only Clang 17 and Clang 18 are compatible with CUDA 12. Use for example CUDACXX=clang++-17.

Here is a minimal CMake configuration with Clang:

CC=clang-17 CXX=clang++-17 CUDACXX=clang++-17 /usr/bin/cmake .. \
    -D CMAKE_CUDA_ARCHITECTURES="61;75" -D CUDAToolkit_ROOT="/usr/lib/cuda"

A compiler warning will be generated during the build:

clang++-17: warning: CUDA version 12.0 is only partially supported [-Wunknown-cuda-version]

jngrad commented 3 months ago

For GCC 13.2.0, one has to silence a compiler error with -D CMAKE_CUDA_FLAGS=-allow-unsupported-compiler. When compiled with this flag, the main GPU algorithms in ESPResSo (P3M, DDS, LB) still work as expected.

espressomd / espresso

Increase required C++ version to C++20 and CUDA version to 12 #4846