rapidsai / rapids-cmake

https://docs.rapids.ai/api/rapids-cmake/stable/
Apache License 2.0

Bump CCCL version to include cuda::std::span fix #631

Closed sleeepyjack closed 3 months ago

sleeepyjack commented 3 months ago

Description

This PR updates the CCCL version to include a fix for cuda::std::span, which is required for cuCollections to work properly with CCCL 2.5.0.

Most of the changes between the last CCCL version bump (#607) and this one were related to doc updates and unit test fixes, so I don't expect much functional impact for RAPIDS.

After this PR is merged, we will likely have to bump the cuco version again to include the new changes.

CCCL PR:

CUCO PR:

RAPIDS PRs:

Checklist

bdice commented 3 months ago

The diff is here: https://github.com/NVIDIA/cccl/compare/fde1cf79bde6744b6739636f502b5edfefe302e1...e21d607157218540cd7c45461213fb96adf720b7

I feel confident in merging this as soon as you think it's sufficiently tested, @sleeepyjack @PointKernel. All of the commits up to, but not including, those from yesterday (which unfortunately include the fix you need) were tested by the latest run of https://github.com/NVIDIA/cccl/pull/1667.

PointKernel commented 3 months ago

Added an rmm PR as well: https://github.com/rapidsai/rmm/pull/1584, since the raft failure (https://github.com/rapidsai/raft/pull/2358) pointed to an rmm invocation:

Thread 1 "CORE_TEST" hit Catchpoint 1 (exception thrown), 0x00007fffb24824a1 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
(cuda-gdb) bt 5
#0  0x00007fffb24824a1 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x0000555555654a0a in rmm::mr::limiting_resource_adaptor<rmm::mr::device_memory_resource>::do_allocate(unsigned long, rmm::cuda_stream_view) ()
#2  0x00005555556c3d8f in void* cuda::mr::__4::_Resource_vtable_builder::_Alloc_async<rmm::mr::limiting_resource_adaptor<rmm::mr::device_memory_resource> >(void*, unsigned long, unsigned long, cuda::__4::stream_ref) ()
#3  0x00005555556becd1 in raft::Raft_WorkspaceResource_Test::TestBody() ()
#4  0x00005555557cfee1 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ()
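
For readers unfamiliar with frame #1: the exception originates in a limiting resource adaptor's do_allocate. The following is a minimal, standalone sketch of that general pattern, assuming the adaptor throws when a request would exceed a configured byte limit. It is not rmm's implementation; the names upstream_resource and limiting_adaptor are hypothetical, and host malloc stands in for device memory.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical stand-in for an upstream memory resource (host malloc here,
// not a real device allocator).
struct upstream_resource {
  void* allocate(std::size_t bytes) {
    void* p = std::malloc(bytes);
    if (p == nullptr) { throw std::bad_alloc{}; }
    return p;
  }
  void deallocate(void* p, std::size_t /*bytes*/) noexcept { std::free(p); }
};

// Hypothetical limiting wrapper: tracks the bytes currently handed out and
// throws when a request would push the total past the configured limit.
class limiting_adaptor {
 public:
  limiting_adaptor(upstream_resource* upstream, std::size_t limit)
    : upstream_{upstream}, limit_{limit} {}

  void* allocate(std::size_t bytes) {
    // used_ never exceeds limit_, so the subtraction cannot underflow.
    if (bytes > limit_ - used_) { throw std::runtime_error{"Exceeded memory limit"}; }
    void* p = upstream_->allocate(bytes);
    used_ += bytes;
    return p;
  }

  void deallocate(void* p, std::size_t bytes) noexcept {
    upstream_->deallocate(p, bytes);
    used_ -= bytes;
  }

  std::size_t used() const noexcept { return used_; }

 private:
  upstream_resource* upstream_;
  std::size_t limit_;
  std::size_t used_{0};
};
```

With a 1024-byte limit, for example, a 512-byte allocation succeeds and a subsequent 1024-byte request throws. In a design like this, a throw from the allocate path can be the intended behavior of a limit test and does not by itself indicate a bug.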

sleeepyjack commented 3 months ago

cugraph CI has some problems. I restarted it one more time. Fingers crossed. Apart from that, all other projects seem to be fine with the change.

sleeepyjack commented 3 months ago

cugraph now finally passes all unit tests as well. What's going on with the failing rapids-cmake tests? I can't make sense of them, so I'd like a second pair of eyes to verify whether they're related to this change.

sleeepyjack commented 3 months ago

All problems have been resolved. This PR is ready for review.

bdice commented 3 months ago

/merge

bdice commented 3 months ago

Thanks @sleeepyjack!