NVIDIA / cccl

CUDA Core Compute Libraries
https://nvidia.github.io/cccl/

[BUG]: Proclaiming copyable arguments for lambdas fails to compile #2834

Closed bernhardmgruber closed 2 days ago

bernhardmgruber commented 2 weeks ago

Type of Bug

Compile-time Error

Component

libcu++

Describe the bug

The following code fails to compile:

#include <thrust/transform.h>
#include <thrust/device_vector.h>

using T = int;

int main() {
    const auto n = 16354564;
    thrust::device_vector<T> a(n, 424);
    thrust::device_vector<T> b(n, 74565);
    thrust::device_vector<T> c(n);

    thrust::transform(a.begin(), a.end(), b.begin(), c.begin(),
        cuda::proclaim_copyable_arguments(
            [] __host__ __device__ (const T& lhs, const T& rhs) {
                return lhs + rhs;
        }));
}

How to Reproduce

https://godbolt.org/z/KbPKaoxW8

Expected behavior

The code should compile successfully.

bernhardmgruber commented 2 weeks ago

Surprisingly, it also fails for ordinary function objects: https://godbolt.org/z/EqbjzeT9h

bernhardmgruber commented 1 week ago

Alright, this is a problem with the host compiler g++-10, which is what Compiler Explorer uses. I can reproduce the problem locally. With g++-11:

bgruber@concorde:~/dev$ cat proclaim.cu 
#include <thrust/transform.h>
#include <thrust/device_vector.h>

using T = int;

int main() {
    const auto n = 16354564;
    thrust::device_vector<T> a(n, 424);
    thrust::device_vector<T> b(n, 74565);
    thrust::device_vector<T> c(n);

    thrust::transform(a.begin(), a.end(), b.begin(), c.begin(),
        cuda::proclaim_copyable_arguments(
            [] __host__ __device__ (const T& lhs, const T& rhs) {
                return lhs + rhs;
        }));
}
bgruber@concorde:~/dev$ /usr/local/cuda-12.6/bin/nvcc -ccbin g++-11 --extended-lambda -Icccl/cub -Icccl/thrust -Icccl/libcudacxx/include ./proclaim.cu 

it works, but switching the host compiler to g++-10 produces an error:

bgruber@concorde:~/dev$ /usr/local/cuda-12.6/bin/nvcc -ccbin g++-10 --extended-lambda -Icccl/cub -Icccl/thrust -Icccl/libcudacxx/include ./proclaim.cu 
cccl/libcudacxx/include/cuda/__functional/address_stability.h(66): error: no instance of constructor "cuda::__4::__callable_permitting_copied_arguments<F>::__callable_permitting_copied_arguments [with F=lambda [](const T &, const T &)->int]" matches the argument list
            argument types are: (lambda [](const T &, const T &)->int)
    return __callable_permitting_copied_arguments<F>{::cuda::std::__4::move(f)};
                                                    ^
cccl/libcudacxx/include/cuda/__functional/address_stability.h(49): note #3326-D: function "cuda::__4::__callable_permitting_copied_arguments<F>::__callable_permitting_copied_arguments(const cuda::__4::__callable_permitting_copied_arguments<lambda [](const T &, const T &)->int> &) [with F=lambda [](const T &, const T &)->int]" does not match because argument #1 does not match parameter
  struct __callable_permitting_copied_arguments : F
         ^
cccl/libcudacxx/include/cuda/__functional/address_stability.h(49): note #3326-D: function "cuda::__4::__callable_permitting_copied_arguments<F>::__callable_permitting_copied_arguments(cuda::__4::__callable_permitting_copied_arguments<lambda [](const T &, const T &)->int> &&) [with F=lambda [](const T &, const T &)->int]" does not match because argument #1 does not match parameter
  struct __callable_permitting_copied_arguments : F
         ^
cccl/libcudacxx/include/cuda/__functional/address_stability.h(49): note #3322-D: number of parameters of function "cuda::__4::__callable_permitting_copied_arguments<F>::__callable_permitting_copied_arguments() [with F=lambda [](const T &, const T &)->int]" does not match the call
  struct __callable_permitting_copied_arguments : F
         ^
          detected during instantiation of "auto cuda::__4::proclaim_copyable_arguments(F)->cuda::__4::__callable_permitting_copied_arguments<F> [with F=lambda [](const T &, const T &)->int]" at line 16 of ./proclaim.cu

1 error detected in the compilation of "./proclaim.cu".

bernhardmgruber commented 1 week ago

We need to implement a workaround for g++-10.
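
One possible shape for such a workaround, as a rough host-only sketch rather than the actual CCCL fix: the diagnostic above lists only the implicit default, copy and move constructors as candidates, which suggests that the brace-initialization of the wrapper through its base class `F` is not accepted in the g++-10 configuration. Giving the wrapper an explicit constructor would avoid depending on that form of initialization. The names `callable_permitting_copied_arguments_sketch`, `proclaim_copyable_arguments_sketch` and `permits_copied_arguments_sketch` are hypothetical stand-ins, the `__host__ __device__` annotations are omitted so the snippet builds with a plain C++ compiler, and it has not been checked against nvcc with g++-10.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the library-internal wrapper; NOT the CCCL implementation.
// It inherits from the callable so the wrapped object can be invoked directly.
template <class F>
struct callable_permitting_copied_arguments_sketch : F
{
  // Assumption: a user-provided constructor sidesteps the brace-initialization
  // of the base class that the failing expression `Wrapper<F>{std::move(f)}` relies on.
  explicit callable_permitting_copied_arguments_sketch(F f)
      : F(std::move(f))
  {}
};

// Hypothetical counterpart of cuda::proclaim_copyable_arguments.
// Uses parentheses, so overload resolution picks the explicit constructor above.
template <class F>
auto proclaim_copyable_arguments_sketch(F f) -> callable_permitting_copied_arguments_sketch<F>
{
  return callable_permitting_copied_arguments_sketch<F>(std::move(f));
}

// Hypothetical trait an algorithm could query to learn that arguments may be copied.
template <class F>
struct permits_copied_arguments_sketch : std::false_type
{};

template <class F>
struct permits_copied_arguments_sketch<callable_permitting_copied_arguments_sketch<F>> : std::true_type
{};
```

Wrapping a lambda with `proclaim_copyable_arguments_sketch` yields an object that is still callable with the original arguments, and `permits_copied_arguments_sketch` is true for the wrapped type and false for the bare lambda type.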