Closed sg0 closed 3 months ago
I'm sorry you're having difficulty getting things to compile. These examples were constructed assuming that you would build cugraph from source, so we haven't tested doing what you are attempting to do. I have done some work in this regard.
I can't reproduce your environment entirely. However, I was able to do the following:
mamba install cmake
mamba install openmpi
build.sh all
in that top level directory and it was able to build the executable you are interested in. [Unfortunately, it will complain when trying to build the developer tests. Those depend on the release being built successfully from source, because they access the graph primitives library, which isn't exported.]
Try adding the following to your compile line. If it doesn't work, let me know whether you are seeing the same errors or new errors (and what they are).
-DFMT_HEADER_ONLY=1 -DLIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE -DSPDLOG_FMT_EXTERNAL -DTHRUST_DISABLE_ABI_NAMESPACE -DTHRUST_IGNORE_ABI_NAMESPACE_ERROR
You're already setting SPDLOG_FMT_EXTERNAL, so you don't need to specify it twice.
Thanks, I get a long list of errors, mostly redefinition errors, owing to CCCL (cuda-cccl):
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/cuda/std/detail/libcxx/include/__concepts/../__concepts/../__concepts/convertible_to.h:60:1: note: in expansion of macro ‘_LIBCUDACXX_CONCEPT_FRAGMENT’
60 | _LIBCUDACXX_CONCEPT_FRAGMENT(
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/cuda/std/detail/libcxx/include/__concepts/../__concepts/../__concepts/convertible_to.h:63:14: error: there are no arguments to ‘_LIBCUDACXX_TRAIT’ that depend on a template parameter, so a declaration of ‘_LIBCUDACXX_TRAIT’ must be available [-fpermissive]
63 | requires(_LIBCUDACXX_TRAIT(is_convertible, _From, _To)),
| ^~~~~~~~~~~~~~~~~
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/cuda/std/detail/libcxx/include/__concepts/__concept_macros.h:225:23: note: in definition of macro ‘_LIBCUDACXX_CONCEPT_FRAGMENT_REQS_REQUIRES_requires’
225 | _Concept::_Requires<__VA_ARGS__>
| ^~~~~~~~~~~
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/cuda/std/detail/libcxx/include/__concepts/__concept_macros.h:41:39: note: in expansion of macro ‘_LIBCUDACXX_PP_CAT4_’
41 | #define _LIBCUDACXX_PP_CAT4(_Xp, ...) _LIBCUDACXX_PP_CAT4_(_Xp, __VA_ARGS__)
| ^~~~~~~~~~~~~~~~~~~~
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/cuda/std/detail/libcxx/include/__concepts/__concept_macros.h:153:3: note: in expansion of macro ‘_LIBCUDACXX_PP_CAT4’
153 | _LIBCUDACXX_PP_CAT4(_LIBCUDACXX_CONCEPT_FRAGMENT_REQS_REQUIRES_, _REQ)
| ^~~~~~~~~~~~~~~~~~~
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/cuda/std/detail/libcxx/include/__concepts/__concept_macros.h:147:3: note: in expansion of macro ‘_LIBCUDACXX_CONCEPT_FRAGMENT_REQS_REQUIRES_OR_NOEXCEPT’
147 | _LIBCUDACXX_CONCEPT_FRAGMENT_REQS_REQUIRES_OR_NOEXCEPT
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/cuda/std/detail/libcxx/include/__concepts/__concept_macros.h:34:40: note: in expansion of macro ‘_LIBCUDACXX_CONCEPT_FRAGMENT_REQS_M0’
34 | #define _LIBCUDACXX_PP_CAT2_(_Xp, ...) _Xp##__VA_ARGS__
So, I uninstalled CCCL, and then retried:
(cugraph-ldgpu2) [ghos167@deception04 mg-graph]$ mpic++ -DSPDLOG_FMT_EXTERNAL -DFMT_HEADER_ONLY=1 -DLIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE -DTHRUST_DISABLE_ABI_NAMESPACE -DTHRUST_IGNORE_ABI_NAMESPACE_ERROR -I/share/apps/cuda/12.1/include -I/people/ghos167/.conda/envs/cugraph-ldgpu2/include -std=c++17 -o mg_test mg_graph_algorithms.cpp -L/share/apps/cuda/12.1/lib -L/people/ghos167/.conda/envs/cugraph-ldgpu2/lib -lcuda -lcudart -lcugraph
In file included from /people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/device_uvector.hpp:19,
from /people/ghos167/.conda/envs/cugraph-ldgpu2/include/cugraph/dendrogram.hpp:18,
from /people/ghos167/.conda/envs/cugraph-ldgpu2/include/cugraph/algorithms.hpp:19,
from mg_graph_algorithms.cpp:17:
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/cuda_stream_view.hpp:21:10: fatal error: cuda/stream_ref: No such file or directory
21 | #include <cuda/stream_ref>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
So the error above is why I installed CCCL.
Thanks for that update. I think I have a fix for that also, but I'm away from my computer for the night. I'll post something in the morning.
The problem you are seeing, I believe, is due to the fact that some of the header files are present in multiple directories. One drawback of the #pragma once approach that most C++ developers have moved to for avoiding duplicate headers is that if the same header file appears in different directories, it can actually be included twice, resulting in the duplicate symbols you are seeing.
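To make the mechanism concrete, here is a minimal shell sketch; it is not taken from the cugraph sources. It assumes a C++ compiler is reachable as g++ (or through the CXX variable). The directory names toolkit and conda and the headers widget.h and uses_widget.h are hypothetical stand-ins for two header trees that ship different versions of the same file.

```shell
#!/bin/sh
# Hypothetical reproduction: two different versions of a header with the same name.
cxx="${CXX:-g++}"
tmp="$(mktemp -d)"
mkdir -p "$tmp/toolkit" "$tmp/conda"

# Older copy, standing in for a header shipped with a system toolkit.
printf '#pragma once\nstruct Widget { int a; };\n' > "$tmp/toolkit/widget.h"
# Newer copy, standing in for the header shipped in the conda environment.
printf '#pragma once\nstruct Widget { int a; int b; };\n' > "$tmp/conda/widget.h"
# A library header that pulls in its sibling copy through a quoted include.
printf '#pragma once\n#include "widget.h"\ninline int widget_size() { return sizeof(Widget); }\n' > "$tmp/conda/uses_widget.h"
printf '#include <widget.h>\n#include <uses_widget.h>\nint main() { return widget_size() > 0 ? 0 : 1; }\n' > "$tmp/main.cpp"

# toolkit/ searched first: <widget.h> resolves to toolkit/widget.h, and
# uses_widget.h then includes conda/widget.h as well. #pragma once sees two
# distinct files, so struct Widget is defined twice.
"$cxx" -fsyntax-only -I"$tmp/toolkit" -I"$tmp/conda" "$tmp/main.cpp" 2> "$tmp/dup.log"
rc_dup=$?

# conda/ searched first, toolkit/ demoted to -isystem: both includes now
# resolve to conda/widget.h, which #pragma once skips the second time.
"$cxx" -fsyntax-only -I"$tmp/conda" -isystem "$tmp/toolkit" "$tmp/main.cpp" 2> "$tmp/fixed.log"
rc_fixed=$?

echo "toolkit first: exit $rc_dup; conda first: exit $rc_fixed"
```

With a working compiler, the first invocation should fail with a redefinition error (recorded in dup.log) and the second should succeed, which is the same effect that reordering the real include paths is aiming for.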
You'll need to experiment a bit, since I can't exactly replicate your environment. Here's a combination of include options that worked for me.
-I/raid/charlesh/mambaforge/envs/test_issue_4596/include/rapids -I/raid/charlesh/mambaforge/envs/test_issue_4596/include/rapids/libcudacxx -isystem /raid/charlesh/mambaforge/envs/test_issue_4596/include -isystem /raid/charlesh/mambaforge/envs/test_issue_4596/targets/x86_64-linux/include
This link provides some explanation of the motivation for -I versus -isystem. The short version is that -I directories are searched first, then -isystem directories, then the system libraries. The objective is to separate things so that the duplicate header files are at a different level (only one in the -I directories, any others in a -isystem directory or one of the system libraries).
Obviously change the directory path to point to your conda environment. A RAPIDS installation should have the first 2 elements as part of your conda environment. That should give you the proper versions of thrust, cub, CCCL. The third item in the list will get you cugraph and any other conda packages installed. The last item I think is required to pick up some of the headers specific to x86_64 architectures (some of the implementation details).
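As a rough sketch of adapting those options to another environment, the flags can be assembled from a single prefix variable. ENV_PREFIX and INCLUDE_FLAGS are hypothetical names chosen for illustration; CONDA_PREFIX is the variable conda sets when an environment is active, and the fallback path is only a placeholder.

```shell
#!/bin/sh
# ENV_PREFIX is a hypothetical helper variable; the fallback is a placeholder path.
ENV_PREFIX="${CONDA_PREFIX:-/path/to/conda/env}"

# -I entries are searched first: the RAPIDS-bundled thrust, cub and CCCL headers.
INCLUDE_FLAGS="-I${ENV_PREFIX}/include/rapids -I${ENV_PREFIX}/include/rapids/libcudacxx"

# -isystem entries are searched afterwards: cugraph and the other conda
# packages, then the x86_64-specific target headers.
INCLUDE_FLAGS="${INCLUDE_FLAGS} -isystem ${ENV_PREFIX}/include -isystem ${ENV_PREFIX}/targets/x86_64-linux/include"

echo "${INCLUDE_FLAGS}"
```

The printed string can then be pasted into the compile line, or passed unquoted as ${INCLUDE_FLAGS} so the shell splits it into separate arguments (this assumes the prefix contains no spaces).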
Thanks, I encountered a long list of errors with -DLIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE in the updated includes after your suggestion, so I removed it:
mpic++ -DSPDLOG_FMT_EXTERNAL -DFMT_HEADER_ONLY=1 -DTHRUST_DISABLE_ABI_NAMESPACE -DTHRUST_IGNORE_ABI_NAMESPACE_ERROR -I/share/apps/cuda/12.1/include -I/people/ghos167/.conda/envs/cugraph-ldgpu2/include/rapids -I/people/ghos167/.conda/envs/cugraph-ldgpu2/include/rapids/libcudacxx -isystem /people/ghos167/.conda/envs/cugraph-ldgpu2/include -isystem /people/ghos167/.conda/envs/cugraph-ldgpu2/targets/x86_64-linux/include -std=c++17 -o mg_test mg_graph_algorithms.cpp -L/share/apps/cuda/12.1/lib -L/people/ghos167/.conda/envs/cugraph-ldgpu2/lib -lcuda -lcudart -lcugrap
But then, there are these CUDA namespace related issues in RMM:
In file included from /people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/device_uvector.hpp:19,
from /people/ghos167/.conda/envs/cugraph-ldgpu2/include/cugraph/dendrogram.hpp:18,
from /people/ghos167/.conda/envs/cugraph-ldgpu2/include/cugraph/algorithms.hpp:19,
from mg_graph_algorithms.cpp:17:
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/cuda_stream_view.hpp:67:30: error: ‘cuda’ has not been declared
67 | constexpr cuda_stream_view(cuda::stream_ref stream) noexcept : stream_{stream.get()} {}
| ^~~~
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/cuda_stream_view.hpp:67:46: error: expected ‘)’ before ‘stream’
67 | constexpr cuda_stream_view(cuda::stream_ref stream) noexcept : stream_{stream.get()} {}
| ~ ^~~~~~~
| )
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/cuda_stream_view.hpp:67:88: error: expected unqualified-id before ‘{’ token
67 | constexpr cuda_stream_view(cuda::stream_ref stream) noexcept : stream_{stream.get()} {}
| ^
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/cuda_stream_view.hpp:88:22: error: ‘cuda’ does not name a type; did you mean ‘cudaPos’?
88 | constexpr operator cuda::stream_ref() const noexcept { return value(); }
| ^~~~
| cudaPos
In file included from /people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/mr/device/cuda_memory_resource.hpp:20,
from /people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/mr/device/per_device_resource.hpp:21,
from /people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/device_buffer.hpp:21,
from /people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/device_uvector.hpp:22:
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/mr/device/device_memory_resource.hpp:310:59: error: ‘cuda’ has not been declared
310 | friend void get_property(device_memory_resource const&, cuda::mr::device_accessible) noexcept {}
| ^~~~
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/mr/device/device_memory_resource.hpp:359:15: error: ‘cuda’ has not been declared
359 | static_assert(cuda::mr::async_resource_with<device_memory_resource, cuda::mr::device_accessible>);
| ^~~~
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/mr/device/device_memory_resource.hpp:359:67: error: expected primary-expression before ‘,’ token
359 | static_assert(cuda::mr::async_resource_with<device_memory_resource, cuda::mr::device_accessible>);
| ^
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/mr/device/device_memory_resource.hpp:359:69: error: expected string-literal before ‘cuda’
359 | static_assert(cuda::mr::async_resource_with<device_memory_resource, cuda::mr::device_accessible>);
| ^~~~
/people/ghos167/.conda/envs/cugraph-ldgpu2/include/rmm/mr/device/device_memory_resource.hpp:359:68: error: expected ‘)’ before ‘cuda’
359 | static_assert(cuda::mr::async_resource_with<device_memory_resource, cuda::mr::device_accessible>);
| ~ ^~~~~
| )
I also tried building RMM separately, but see the previous errors.
/people/ghos167/builds/rmm-cuda-12.1/include/rmm/device_buffer.hpp:171:79: error: invalid conversion from ‘rmm::mr::device_memory_resource*’ to ‘int’ [-fpermissive]
171 | device_async_resource_ref mr = mr::get_current_device_resource())
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
| |
| rmm::mr::device_memory_resource*
/people/ghos167/builds/rmm-cuda-12.1/include/rmm/device_buffer.hpp: In constructor ‘rmm::device_buffer::device_buffer()’:
/people/ghos167/builds/rmm-cuda-12.1/include/rmm/device_buffer.hpp:97:21: error: class ‘rmm::device_buffer’ does not have any field named ‘_mr’
97 | device_buffer() : _mr{rmm::mr::get_current_device_resource()} {}
| ^~~
/people/ghos167/builds/rmm-cuda-12.1/include/rmm/device_buffer.hpp: In constructor ‘rmm::device_buffer::device_buffer(std::size_t, rmm::cuda_stream_view, int)’:
/people/ghos167/builds/rmm-cuda-12.1/include/rmm/device_buffer.hpp:112:24: error: class ‘rmm::device_buffer’ does not have any field named ‘_mr’
112 | : _stream{stream}, _mr{mr}
| ^~~
/people/ghos167/builds/rmm-cuda-12.1/include/rmm/device_buffer.hpp: In constructor ‘rmm::device_buffer::device_buffer(const void*, std::size_t, rmm::cuda_stream_view, int)’:
/people/ghos167/builds/rmm-cuda-12.1/include/rmm/device_buffer.hpp:141:24: error: class ‘rmm::device_buffer’ does not have any field named ‘_mr’
141 | : _stream{stream}, _mr{mr}
| ^~~
/people/ghos167/builds/rmm-cuda-12.1/include/rmm/device_buffer.hpp: In constructor ‘rmm::device_buffer::device_buffer(rmm::device_buffer&&)’:
/people/ghos167/builds/rmm-cuda-12.1/include/rmm/device_buffer.hpp:192:7: error: class ‘rmm::device_buffer’ does not have any field named ‘_mr’
192 | _mr{other._mr},
| ^~~
/people/ghos167/builds/rmm-cuda-12.1/include/rmm/device_buffer.hpp:192:17: error: ‘class rmm::device_buffer’ has no member named ‘_mr’
192 | _mr{other._mr},
| ^~~
/people/ghos167/builds/rmm-cuda-12.1/include/rmm/device_buffer.hpp: In member function ‘rmm::device_buffer& rmm::device_buffer::operator=(rmm::device_buffer&&)’:
/people/ghos167/builds/rmm-cuda-12.1/include/rmm/device_buffer.hpp:226:7: error: ‘_mr’ was not declared in this scope; did you mean ‘mr’?
226 | _mr = other._mr;
| ^~~
| mr
/people/ghos167/builds/rmm-cuda-12.1/include/rmm/device_buffer.hpp:226:23: error: ‘class rmm::device_buffer’ has no member named ‘_mr’
226 | _mr = other._mr;
Try moving -I/share/apps/cuda/12.1/include to be -isystem /share/apps/cuda/12.1/include, and maybe put it last in the order.
The error looks like you're picking up a different version of a cuda file.
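One way to check which copy of a header the compiler actually opens is the -H option, which GCC and Clang both accept and which prints the path of every included file to stderr. The sketch below is only illustrative: probe.cpp is a hypothetical file name, and it includes <cstddef> so that it runs without CUDA installed.

```shell
#!/bin/sh
# Hypothetical probe: ask the compiler which file an #include resolves to.
cxx="${CXX:-g++}"
tmp="$(mktemp -d)"
printf '#include <cstddef>\nint main() { return 0; }\n' > "$tmp/probe.cpp"

# -H lists each header as it is opened (on stderr);
# -fsyntax-only skips code generation and linking.
"$cxx" -H -fsyntax-only "$tmp/probe.cpp" 2> "$tmp/headers.log"

# Show the first path under which <cstddef> was found.
grep 'cstddef' "$tmp/headers.log" | head -n 1
```

For the situation in this thread, the probe file would include <cuda/stream_ref> instead and be compiled with the same -I and -isystem options as the real build; the reported path shows whether the conda copy or the CUDA 12.1 copy is winning.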
Never mind. This error occurs when you don't include -DLIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE; you need that flag enabled.
What errors are you seeing when you enable that flag?
Upon close inspection, I noticed that the errors were "multiple redefinition errors", so I moved the CUDA runtime headers later as you had suggested. Then it worked:
mpic++ -DSPDLOG_FMT_EXTERNAL -DFMT_HEADER_ONLY=1 -DLIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE -DTHRUST_DISABLE_ABI_NAMESPACE -DTHRUST_IGNORE_ABI_NAMESPACE_ERROR -I/people/ghos167/.conda/envs/cugraph-ldgpu2/include/rapids -I/people/ghos167/.conda/envs/cugraph-ldgpu2/include/rapids/libcudacxx -isystem /people/ghos167/.conda/envs/cugraph-ldgpu2/include -isystem /people/ghos167/.conda/envs/cugraph-ldgpu2/targets/x86_64-linux/include -isystem /share/apps/cuda/12.1/include -std=c++17 -o mg_test mg_graph_algorithms.cpp -L/share/apps/cuda/12.1/lib -L/people/ghos167/.conda/envs/cugraph-ldgpu2/lib -ldl -lcudart -lcugraph -lnccl
Thanks for the suggestions, I am closing the issue.
Great. Glad you were able to resolve this.
What is your question?
I have been trying to build a C++-based multi-GPU example from cugraph on a Linux cluster:
https://github.com/rapidsai/cugraph/blob/branch-24.10/cpp/examples/users/multi_gpu_application/mg_graph_algorithms.cpp
But I am encountering several “not declared in scope” issues, such as the following, which suggests that I am probably not passing some path or missing a dependency:
Since building cugraph from source is time consuming (I gave up after ~6 hours), I decided to pull the packages via conda (attached the yml file). Here are the modules on my platform:
I am trying to build using:
cugraph-ldgpu2.yml.txt