supranational / sppark

Zero-knowledge template library
Apache License 2.0
186 stars 66 forks source link

Compilation error (observed from pasta-msm) #32

Open huitseeker opened 11 months ago

huitseeker commented 11 months ago

TL;DR: compilation error on a cargo check

I upgraded nvcc to 12.3 according to the hints in https://github.com/supranational/pasta-msm/commit/8ccdc45ca5938168b0a3f67462765926a0da9b40 and got a compilation error on a simple cargo check:

The warnings are several instances of the following warning :

warning: pasta-msm@0.1.4: /usr/include/c++/13.2.1/bits/std_function.h: In instantiation of ‘std::function<_Res(_ArgTypes ...)>::_Requires<std::function<_Res(_ArgTypes ...)>::_Callable<_Functor>, std::function<_Res(_ArgTypes ...)>&> std::function<_Res(_ArgTypes ...)>::operator=(_Functor&&) [with _Functor = std::function<void()>&; _Res = void; _ArgTypes = {}; _R
equires<_Callable<_Functor>, std::function<_Res(_ArgTypes ...)>&> = std::function<void()>&; typename std::enable_if<(! std::is_same<typename std::remove_cv<typename std::remove_reference<_Tuple>::type>::type, std::function<_Res(_ArgTypes ...)> >::value), std::decay<_Func> >::type::type = std::function<void()>; typename std::enable_if<(! std::is_same<typename s
td::remove_cv<typename std::remove_reference<_Tuple>::type>::type, std::function<_Res(_ArgTypes ...)> >::value), std::decay<_Func> >::type = std::decay<std::function<void()>&>; typename std::remove_cv<typename std::remove_reference<_Tuple>::type>::type = std::function<void()>&; typename std::remove_reference<_Tuple>::type = std::function<void()>&]’:
warning: pasta-msm@0.1.4: /home/huitseeker/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sppark-0.1.5/sppark/util/thread_pool_t.hpp:164:24:   required from here
warning: pasta-msm@0.1.4: /usr/include/c++/13.2.1/bits/std_function.h:534:44: error: call of overloaded ‘forward<std::function<void()>&>(std::function<void()>&)’ is ambiguous
warning: pasta-msm@0.1.4:   534 |           function(std::forward<_Functor>(__f)).swap(*this);
warning: pasta-msm@0.1.4:       |                   ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
warning: pasta-msm@0.1.4: /usr/include/c++/13.2.1/bits/move.h:70:1: note: candidate: ‘constexpr _Tp&& std::forward(typename remove_reference<_Functor>::type&) [with _Tp = function<void()>&; typename remove_reference<_Functor>::type = function<void()>&]’
warning: pasta-msm@0.1.4:    70 |     forward(typename std::remove_reference<_Tp>::type& __t) noexcept
warning: pasta-msm@0.1.4:       | ^   ~~~
warning: pasta-msm@0.1.4: /usr/include/c++/13.2.1/bits/move.h:82:1: note: candidate: ‘constexpr _Tp&& std::forward(typename remove_reference<_Functor>::type&&) [with _Tp = function<void()>&; typename remove_reference<_Functor>::type = function<void()>&]’
warning: pasta-msm@0.1.4:    82 |     forward(typename std::remove_reference<_Tp>::type&& __t) noexcept
warning: pasta-msm@0.1.4:       | ^   ~~~
warning: pasta-msm@0.1.4: /usr/include/c++/13.2.1/bits/vector.tcc: In instantiation of ‘std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::emplace_back(_Args&& ...) [with _Args = {std::thread&}; _Tp = std::thread; _Alloc = std::allocator<std::thread>; reference = std::thread&]’:
warning: pasta-msm@0.1.4: /usr/include/c++/13.2.1/bits/stl_vector.h:1296:15:   required from ‘void std::vector<_Tp, _Alloc>::push_back(value_type&&) [with _Tp = std::thread; _Alloc = std::allocator<std::thread>; value_type = std::thread]’
warning: pasta-msm@0.1.4: /home/huitseeker/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sppark-0.1.5/sppark/util/thread_pool_t.hpp:79:20:   required from here
...

Full output below.

Details

Version info ``` huitseeker@sleipnir➜huitseeker/tmp/pasta-msm(main)» nvcc --version [10:04:07] nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2023 NVIDIA Corporation Built on Fri_Sep__8_19:17:24_PDT_2023 Cuda compilation tools, release 12.3, V12.3.52 Build cuda_12.3.r12.3/compiler.33281558_0 huitseeker@sleipnir➜huitseeker/tmp/pasta-msm(main)» gcc --version [10:06:00] gcc (GCC) 13.2.1 20230730 Copyright (C) 2023 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. huitseeker@sleipnir➜huitseeker/tmp/pasta-msm(main)» uname -a [10:06:30] Linux sleipnir 6.1.24-1-lts #1 SMP PREEMPT_DYNAMIC Thu, 13 Apr 2023 17:22:35 +0000 x86_64 GNU/Linux ```
dot-asm commented 11 months ago

Hmmm... Is it actually so that nvcc is the only new component? I mean I'm using 12.3 and it's perfectly all right. Well, I don't have gcc 13, and it seems that it's gcc 13 that is the culprit...

dot-asm commented 11 months ago

Moreover, if I put together gcc 13.2.0 and nvcc 12.3 it says #error -- unsupported GNU version! gcc versions later than 12 are not supported. This is in a ubuntu:devel docker container... How come it lets you through with just a warning? Which distribution ~is it~ do you run?

huitseeker commented 11 months ago

My indication about updating nvcc 12.1 -> 12.3 was just for context, it's indeed not the only moving piece.

Here's the output of cargo clean && CC=/usr/lib/llvm14/bin/clang CXX=/usr/lib/llvm14/bin/clang++ cargo check: https://gist.github.com/huitseeker/f65afe92e9275daa9e1a85ef8eaca698

huitseeker@sleipnir➜huitseeker/tmp/pasta-msm(main)» /usr/lib/llvm14/bin/clang++ --version                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   [11:15:02]
clang version 14.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm14/bin

That of cargo clean && CC=/usr/bin/clang CXX=/usr/bin/clang++ cargo check: https://gist.github.com/huitseeker/11c3e4cf73d8fc7a12424dfa53824deb

That later one is versioned thus:

huitseeker@sleipnir➜huitseeker/tmp/pasta-msm(main)» /usr/bin/clang++ --version                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              [11:13:03]
clang version 16.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

Those warnings are indeed different from the gcc ouput.

dot-asm commented 11 months ago

Incidentally clang++ warnings were spotted today, as result of introduction of the .github/workflows/ci.yml that exercises both c++ and clang++. They will be resolved at a later point. But how do I reproduce the problem with gcc? Well, we're likely to pull the "unsupported" card on the basis of the #error -- unsupported GNU version!, question is rather if it's a learning opportunity [to make it future-proof]...

huitseeker commented 11 months ago

I'm running on Archlinux, which installation of CUDA indeed removes this warning: https://gitlab.archlinux.org/archlinux/packaging/packages/cuda/-/blob/main/PKGBUILD?ref_type=heads#L130

With that said, despite NVIDIA being extra cautious, this has worked for about a year: https://gitlab.archlinux.org/archlinux/packaging/packages/cuda/-/commit/5c0d3a90e04541f8ab011b616c3f29b01c8af74b

On the other hand, could it be that replacing: https://github.com/supranational/sppark/blob/8ae27a6ce4586323eb383031c317bd4ec88c4895/util/thread_pool_t.hpp#L146-L148 with:

    std::unique_lock<std::mutex> lock(mtx);
    std::function<void()> func = work;
    fifo.emplace_back(std::move(func));
    cvar.notify_one();  // wake up a worker thread

Would suffice to lift the ambiguity?

dot-asm commented 11 months ago

I'm running on Archlinux, which installation of CUDA indeed removes this warning:

Well, then it's Archlinux's responsibility, isn't it? :-) But on a serious note, you have to agree that it's not really reasonable to expect ~this~ that we can do better than the vendor [Nvidia in the context]. Nothing prevents you from going "vigilante," but you have to accept the consequences. On a more practical note, it should be noted that gcc 13.2 has no problem compiling sppark itself, which compiles thread_pool_t.hpp, or pippenger.cpp, which also compiles thread_pool_t.hpp. It's specifically nvcc that manages to push gcc buttons and make it angry. Note that it complains even about this line. Note that in the pallas.cu context the line in question is merely parsed. I mean the compiler won't generate actual code for it... I might take another stab at it, unraveling the compilation path, if time permits, but for now I'm pulling the "unsupported" card. (But keep it open as a reminder ;-)

slumber commented 11 months ago

Archlinux cuda depends on gcc12 https://archlinux.org/packages/extra/x86_64/cuda/

Therefore CC=gcc-12 CXX=g++-12 cargo build if anyone comes across this issue.

dot-asm commented 11 months ago

CC=gcc-12 CXX=g++-12 cargo build if anyone comes across this issue.

Right, this is what "pulling the "unsupported" card" means. I mean "don't use unsupported compiler" is logically equivalent to "use a supported one instead." BTW, setting CXX alone is sufficient.

On a related note, formally speaking setting CXX to a version other than system default is not actually a guarantee for success. The trouble is that rustc will call cc to link the final binary, and the CC variable has no effect on it. While mixing g++-12 and cc-13 as linker is a virtually-no-risk operation, you have to recognize it, and keep in mind that if you run into a problem, no matter how unlikely, it might be appropriate to instruct rustc to use the matching linker.

dot-asm commented 5 months ago

Latest nvcc works with gcc up to 13.x and clang up 17.x.