Open mrprajesh opened 8 months ago
There is an option, --without_cugraphops, which you can add to the build command to skip the cugraph-ops dependency. That will cause some of the sampling algorithms (which rely on some closed-source cugraph-ops features) to fail. But everything else (including SSSP) will function properly.
So you can try:
./build.sh clean
./build.sh libcugraph --without_cugraphops
and that should do what you want.
Thanks @ChuckHastings. After installing NCCL, I was able to move past the NCCL error. However, my Chrome/Cinnamon/laptop nearly crashed while spitting out more errors (below) during the build.
git clone -b v24.04.00 https://github.com/rapidsai/cugraph.git
cd cugraph/
./build.sh clean
./build.sh libcugraph --without_cugraphops
#NCCL Error
CMake Error at /home/rajesh/install/cmake-3.28.3-linux-x86_64/share/cmake-3.28/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
Could NOT find NCCL (missing: NCCL_LIBRARY NCCL_INCLUDE_DIR)
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt update
sudo apt install libnccl2 libnccl-dev
./build.sh clean
./build.sh libcugraph --without_cugraphops
[1/632] Building CUDA object CMakeFiles/cugraph.dir/src/community/detail/refine_mg.cu.o
FAILED: CMakeFiles/cugraph.dir/src/community/detail/refine_mg.cu.o
/usr/local/cuda-12.2/bin/nvcc -forward-unknown-to-host-compiler -DCUDA_API_PER_THREAD_DEFAULT_STREAM -DCUTLASS_NAMESPACE=raft_cutlass -DFMT_HEADER_ONLY=1 -DLIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE -DRAFT_COMPILED -DRAFT_SYSTEM_LITTLE_ENDIAN=1 -DSPDLOG_FMT_EXTERNAL -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CUDA -DTHRUST_DISABLE_ABI_NAMESPACE -DTHRUST_HOST_SYSTEM=THRUST_HOST_SYSTEM_CPP -DTHRUST_IGNORE_ABI_NAMESPACE_ERROR -Dcugraph_EXPORTS -I/home/rajesh/temp/cugraph/cpp/../thirdparty -I/home/rajesh/temp/cugraph/cpp/src -I/home/rajesh/temp/cugraph/cpp/include -I/home/rajesh/temp/cugraph/cpp/build/_deps/rmm-src/include -I/home/rajesh/temp/cugraph/cpp/build/_deps/cccl-src/thrust/thrust/cmake/../.. -I/home/rajesh/temp/cugraph/cpp/build/_deps/cccl-src/libcudacxx/lib/cmake/libcudacxx/../../../include -I/home/rajesh/temp/cugraph/cpp/build/_deps/cccl-src/cub/cub/cmake/../.. -I/home/rajesh/temp/cugraph/cpp/build/_deps/fmt-src/include -I/home/rajesh/temp/cugraph/cpp/build/_deps/spdlog-src/include -I/home/rajesh/temp/cugraph/cpp/build/_deps/raft-src/cpp/include -I/home/rajesh/temp/cugraph/cpp/build/_deps/cuco-src/include -I/home/rajesh/temp/cugraph/cpp/build/_deps/nvidiacutlass-src/include -I/home/rajesh/temp/cugraph/cpp/build/_deps/nvidiacutlass-build/include -I/usr/local/cuda-12.2/include -isystem /usr/local/cuda-12.2/targets/x86_64-linux/include -O3 -DNDEBUG -std=c++17 "--generate-code=arch=compute_86,code=[sm_86]" -Xcompiler=-fPIC --expt-extended-lambda --expt-relaxed-constexpr -Werror=cross-execution-space-call -Wno-deprecated-declarations -Xptxas=--disable-warnings -Xcompiler=-Wall,-Wno-error=sign-compare,-Wno-error=unused-but-set-variable -Xfatbin=-compress-all -DNO_CUGRAPH_OPS -MD -MT CMakeFiles/cugraph.dir/src/community/detail/refine_mg.cu.o -MF CMakeFiles/cugraph.dir/src/community/detail/refine_mg.cu.o.d -x cu -c /home/rajesh/temp/cugraph/cpp/src/community/detail/refine_mg.cu -o CMakeFiles/cugraph.dir/src/community/detail/refine_mg.cu.o
Killed
[2/632] Building CUDA object CMakeFiles/cugraph.dir/src/community/detail/refine_sg.cu.o
FAILED:
Thank you for your patience and your assistance. Kind regards, Rajesh
Sorry, I didn't fully read your original input, let me answer these first, then I'll answer your most recent question.
I understood that ops is closed source. So, I even tried from the conda env, which had libcugraphops installed; however, that gave a different error with the NCCL INCLUDE_DIR vars. Could you please clarify the following?
- Is the cpp version usable or buildable at v24.x? Or do we have support only for the py version?
Yes, each branch should be usable/buildable (cpp or python). 24.02 and 24.04 are released branches and should work fine. 24.06 is the latest code and subject to change, however based on our development/CI process our latest branch should also be buildable unless one of our dependencies has changed and we haven't updated to reflect that change yet.
- Can we build cugraph from source via these steps?
Yes, I skipped to this detail of your question in my first answer.
- Can we run the sssp_sg.cu version after installing RAPIDS nightly via conda installation?
If you are only interested in calling the functions as is and are on a supported architecture, you could install the conda packages. If you install the conda packages, your environment should contain the necessary headers and libraries already compiled for your environment and you wouldn't need to build from source. I would certainly recommend this: building libcugraph takes a bit of time, and unless you're on a system that we don't build for (e.g. using an older GCC or a Pascal or older GPU) there's not much benefit in building the code yourself.
There's not enough information in your error message for me to suggest what's going wrong. I see the Killed message in your output. If I had to guess (pure speculation on my part), you may have run out of memory.
We have seen issues where some of our .cu files require a large amount of host memory for the compiler to run. It's possible that running this on your notebook computer doesn't have sufficient memory to complete compilation. That would be even more motivation to use the pre-built versions.
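If memory is the bottleneck, one thing to try is running fewer compile jobs at once. The sketch below is only an illustration: suggest_jobs is a hypothetical helper, the 8 GB-per-job budget is an assumed figure rather than a measured one, and it assumes build.sh honours a PARALLEL_LEVEL environment variable (worth confirming in the script before relying on it).

```shell
# Hypothetical helper: choose a job count that fits in host RAM.
# Arguments: total RAM in GB, CPU core count, assumed GB needed per compile job.
suggest_jobs() {
  sj_mem=$1
  sj_cores=$2
  sj_per_job=${3:-8}   # assumed budget per nvcc process, not a measured value
  sj_jobs=$(( sj_mem / sj_per_job ))
  if [ "$sj_jobs" -lt 1 ]; then sj_jobs=1; fi
  if [ "$sj_jobs" -gt "$sj_cores" ]; then sj_jobs=$sj_cores; fi
  echo "$sj_jobs"
}

# On Linux, read total RAM from /proc/meminfo and print a suggested command.
if [ -r /proc/meminfo ] && command -v nproc >/dev/null 2>&1; then
  total_gb=$(awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' /proc/meminfo)
  jobs=$(suggest_jobs "$total_gb" "$(nproc)")
  # Assumption: build.sh reads PARALLEL_LEVEL to set the number of jobs.
  echo "Suggested: PARALLEL_LEVEL=$jobs ./build.sh libcugraph --without_cugraphops"
fi
```

With a 16 GB laptop and the assumed budget, this suggests two jobs; a single job may still be needed for the heaviest translation units.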
Sorry, I didn't fully read your original input,
Sure, No worries. Thank you for your replies.
you may have run out of memory.
Ah, I see.
We have seen issues where some of our .cu files require a large amount of host memory for the compiler to run. It's possible that running this on your notebook computer doesn't have sufficient memory to complete compilation.
OMG! Thanks.
That would be even more motivation to use the pre-built versions.
Sure. I'll attempt this.
I see there are a lot of developments happening in this complex repo/integrations and due to nx-cugraph.
All I wished for was to run this BFS example at https://github.com/rapidsai/cugraph/blob/branch-24.06/cpp/examples/users/single_gpu_application/sg_graph_algorithms.cpp
It looked very much like gunrock's style of programming so I got interested in checking it out and learning them.
I think you should be able to build those examples from a conda install of the software. Please let us know if you have any issues, the C++ examples are a new feature we just added in the 24.04 release. Any feedback on making them easier to use would be wonderful.
Any luck on either running from conda installation or building things on a system with more memory?
Any luck on either running from conda installation or building things on a system with more memory?
Unfortunately, on a system with more memory, we encountered NCCL errors (which we have to compile from src or use sudo). We tried using the Conda-installed version (back then, before the 24.04 release) but encountered similar roadblocks. // I'll have to check with the release version.
Any feedback on making them easier to use would be wonderful.
It would be nice to have a lite build system, for example, separating single-GPU code from multi-GPU code, i.e. a minimal dependency on the required -I files rather than building the whole of cugraph.
It would be nice if the prerequisites section listed NCCL, cugraphops, etc.
Thank you for all your help and patience. Kind regards,
A thought to try.
We have segregated the SG and MG implementations for many of the algorithms into separate source files. The implementation is generally in a common header file, but the instantiation of the actual functions occurs in separate source files. While we don't have an easy way to skip building the MG code, you could try going into CMakeLists.txt and commenting out the compilation of all of the source files that have an _mg suffix (e.g. src/community/louvain_mg.cu). You'd have to also do that in the tests/CMakeLists.txt.
That might work, or if you combine that with commenting out the references to NCCL in the two CMakeLists.txt files you might get a functioning build.
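As a rough sketch of the first step, the commenting-out could be scripted with sed. Here comment_out_mg is a hypothetical helper, and the sample file contents are illustrative rather than copied from the real CMakeLists.txt. It only handles lines that name an _mg.cu file; the NCCL references would still need to be edited by hand.

```shell
# Hypothetical helper: print a CMake file with every not-yet-commented line
# that mentions an _mg.cu source prefixed by "# ". The input file is untouched.
comment_out_mg() {
  sed -E 's/^([[:space:]]*)([^#[:space:]].*_mg\.cu.*)$/\1# \2/' "$1"
}

# Demo on an illustrative source list (not the real file contents).
sample=$(mktemp)
cat > "$sample" <<'EOF'
set(CUGRAPH_SOURCES
    src/community/louvain_sg.cu
    src/community/louvain_mg.cu
)
EOF
comment_out_mg "$sample"
rm -f "$sample"
```

To apply it for real, one would redirect the output to a new file, review the diff, and then replace the original, repeating the process for tests/CMakeLists.txt.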
Any luck on this?
If you are using the latest branch (our 24.08 development branch) you will see that we split many of the files into smaller translation units to make the compilation require less memory.
What is your question?
We are interested in running the cpp/single-gpu version of SSSP for comparison as baselines in our paper. So, I tried building cugraph from the instructions.
I understood that ops is a closed source. So, I even tried from the conda env, which had libcugraphops installed, however, that gave a different error with nccl INCLUDE_DIR vars. Could you please clarify the following?
- Is the cpp version usable or buildable at v24.x? or do we have support only for py version?
- Can we build cugraph from source via these steps?
- Can we run sssp_sg.cu version after installing RAPIDS nightly via conda installation?
Our machine config.