Open pauleonix opened 2 months ago
I believe this is actually a very intricate bug in the compiler where device lambdas interact strangely with deduced return types of the invoke machinery.
This comes from instantiating result_of_adaptable_function, where it instantiates __invoke_of of the proclaim_return_type wrapped lambda.
However, it does not use the explicit return type of proclaim_return_type but tries to match that with the "return type" of the device lambda, which is not the actual return type.
@pauleonix I tried working around this a bit more but it seems that there is indeed a compiler bug that we need to reduce.
In the meantime you can work around the issue by adding a trailing return type to the lambda
Note that depending on your CTK version you might be able to skip the proclaim_return_type workaround entirely in that case (CTK 12.4 and above)
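As a rough illustration of the trailing-return-type workaround mentioned above, here is a minimal host-only sketch. It is an analogy, not the code from this issue: std::tuple stands in for thrust::tuple, the lambdas carry no __device__ annotation, and the names pair_t, identity_deduced, identity_explicit and check_identity are made up for the example. In real device code the lambda would be written with __device__ and return a thrust::tuple, and whether this avoids the error depends on the toolkit version as noted above.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>

// Assumption: std::tuple is a host-only stand-in for thrust::tuple.
using pair_t = std::tuple<int, int>;

// Return type left to deduction: derived from the return statement.
const auto identity_deduced = [](int a, int b) { return std::make_tuple(a, b); };

// Trailing return type: the result type is spelled out in the lambda declaration.
const auto identity_explicit = [](int a, int b) -> pair_t { return pair_t{a, b}; };

// On the host both forms report the same result type to the invoke machinery.
static_assert(std::is_same_v<std::invoke_result_t<decltype(identity_deduced), int, int>, pair_t>);
static_assert(std::is_same_v<std::invoke_result_t<decltype(identity_explicit), int, int>, pair_t>);

// Small runtime check that the explicit form is still an identity operation.
inline void check_identity() {
  assert(identity_explicit(1, 2) == std::make_tuple(1, 2));
}
```

The only change between the two lambdas is the `-> pair_t` part, which is what the workaround asks for.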
> Note that depending on your CTK version you might be able to skip the proclaim_return_type workaround entirely in that case (CTK 12.4 and above)
That is great news (to me)! 🎉
Is this a duplicate?
Type of Bug
Compile-time Error
Component
Not sure
Describe the bug
With CCCL 2.2.0 it was possible to combine these two on a device lambda to take input from a zip_iterator. Since CCCL 2.3.x/CUDA 12.4 this does not work anymore when returning a thrust::tuple and the order is
Swapping the two seems to have solved the issue for me, i.e.
The compiler error is
How to Reproduce
The reproducer is basically a one-liner. I chose thrust::tuple<int, int> as both output and input of the device lambda, which is just an identity operation here. I also needed to combine this construct with an actual thrust::zip_iterator for the compiler error to materialize in the reproducer. Therefore I added a combination of transform_iterator, zip_iterator and counting_iterators.
Expected behavior
It would be nice if this compiled independent of the order, as it did with CCCL 2.2.0. Naively the failing ordering seems to make more sense because then the compiler knows what return type the zip_function should "inherit".
Reproduction link
https://cuda.godbolt.org/z/McznoKnGx
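To make "order" concrete, here is a host-only sketch of the two ways the wrappers can be nested. It is not the reproducer from the link above and does not use Thrust or libcu++: proclaim_ret, zip_fn, their make_ helpers, zip_outside and proclaim_outside are hypothetical stand-ins written for this example, and std::tuple replaces thrust::tuple. On the host both nestings compile and behave identically, so the sketch only shows the two shapes being compared, not the compiler error itself.

```cpp
#include <cassert>
#include <tuple>
#include <utility>

// Assumption: std::tuple is a host-only stand-in for thrust::tuple.
using pair_t = std::tuple<int, int>;

// Hypothetical stand-in for a wrapper that proclaims the return type R.
template <class R, class F>
struct proclaim_ret {
  F f;
  template <class... Args>
  R operator()(Args&&... args) const {
    return f(std::forward<Args>(args)...);
  }
};

template <class R, class F>
proclaim_ret<R, F> make_proclaim_ret(F f) { return {std::move(f)}; }

// Hypothetical stand-in for a wrapper that unpacks a tuple into call arguments.
template <class F>
struct zip_fn {
  F f;
  template <class Tuple>
  decltype(auto) operator()(Tuple&& t) const {
    return std::apply(f, std::forward<Tuple>(t));
  }
};

template <class F>
zip_fn<F> make_zip_fn(F f) { return {std::move(f)}; }

// Nesting A: zip wrapper outside, proclaimed lambda inside.
inline pair_t zip_outside(pair_t in) {
  auto identity = [](int a, int b) { return pair_t{a, b}; };
  auto fn = make_zip_fn(make_proclaim_ret<pair_t>(identity));
  return fn(in);
}

// Nesting B: proclaiming wrapper outside, zip-wrapped lambda inside.
inline pair_t proclaim_outside(pair_t in) {
  auto identity = [](int a, int b) { return pair_t{a, b}; };
  auto fn = make_proclaim_ret<pair_t>(make_zip_fn(identity));
  return fn(in);
}

// Both nestings act as the identity on a (int, int) tuple.
inline void check_orderings() {
  const pair_t in{3, 4};
  assert(zip_outside(in) == in);
  assert(proclaim_outside(in) == in);
}
```

In nesting A the outer wrapper has to work out the result type of the proclaimed callable it holds, which is the kind of query the maintainer comment describes going wrong for device lambdas.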
Operating System
No response
nvidia-smi output
No response
NVCC version
No response