The first thing I would check is whether you're getting what you asked for. Try adding an nvidia-smi call before calling mpirun to show the allocation you're getting.
You might also set the following environment variable:
export NCCL_DEBUG=TRACE
This will litter your output file with debugging messages, but it looks like there might be a comms issue (it's failing in an all-gather).
Attaching the error and output files with the debug info.
CUGXX_6681940_4294967294.err.txt CUGXX_6681940_4294967294.out.txt
Thanks, I will review.
As you may have found out, this issue is happening at the graph distribution phase, and it might be due to a GPU not having a partition, since the test graph is relatively small (this is easy to fix: just throw an exception if a GPU has empty buffers; a sketch of such a guard is below). Running this on 2 GPUs works at my end, with this error:
*** The MPI_Comm_free() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[a100-04:29501] Local abort after MPI_FINALIZE started completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** The MPI_Comm_free() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[a100-04:29500] Local abort after MPI_FINALIZE started completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
This is probably because MPI_Comm_free is invoked in the handle destructor, which gets called after MPI_Finalize, leading to a memory leak.
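On the empty-partition point above, this is the kind of guard I have in mind (a hypothetical helper for illustration, not an existing cuGraph API; the local edge count would come from whatever buffers hold this GPU's share of the edge list):

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical check, called after the edge list has been distributed:
// fail fast with a clear message instead of crashing later in an all-gather.
void check_nonempty_partition(std::size_t local_edge_count, int rank)
{
  if (local_edge_count == 0) {
    throw std::runtime_error("Rank " + std::to_string(rank) +
                             " received an empty graph partition; the input graph "
                             "is too small for this many GPUs.");
  }
}
```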
Another related question: what is the easiest way to read and distribute Matrix Market files (from the SuiteSparse collection) for use in C++ MG graph codes? Is there a function in the cugraph utilities that can be used?
Sorry, I had not had a chance to look through your logs last week.
These tests were written to get folks started in calling our C++ code directly. I think you have identified a couple of edge conditions that aren't being handled properly in these tests.
About these specific issues:
Your diagnosis of the MPI error is correct; the handle destructor calls MPI_Comm_free() after MPI_Finalize() has already been invoked. If you add a handle.reset() before the call to MPI_Finalize(), that should resolve the problem. I will update our code.
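For reference, here is a minimal sketch of the ordering (assuming the handle is held in a std::unique_ptr<raft::handle_t>, as in the example; device selection, communicator setup, graph construction, and the algorithm calls are all elided):

```cpp
#include <raft/core/handle.hpp>

#include <mpi.h>

#include <memory>

int main(int argc, char** argv)
{
  MPI_Init(&argc, &argv);

  // ... pick the GPU for this rank, attach an MPI-backed communicator to the
  // handle, build the distributed graph, and run the algorithms ...
  auto handle = std::make_unique<raft::handle_t>();

  // Destroy the handle (and any communicators it owns) *before* MPI_Finalize().
  // If it is only destroyed when main() returns, MPI_Comm_free() ends up being
  // called after MPI_Finalize(), which produces the abort messages shown above.
  handle.reset();

  MPI_Finalize();
  return 0;
}
```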
Regarding reading matrix market files, we have functions within our test suite that can be used for this:
There's no fundamental difference between the two (you can look at the code). If you want to tweak the edge list in some way before creating the graph, you should probably use the first; otherwise, the second is less code to manage.
These are less than optimal (we only use them for testing). The biggest issue is that each GPU reads the entire MTX file and then filters out the subset it cares about. That means that you need sufficient GPU memory on each node to contain the entire edge list. I created a function somewhere (never merged it into the code base) that would read a different block of data on each GPU to do the parsing and then shuffle the parsed data to the proper GPU. You could adapt the code I linked to have each GPU read the file in blocks and filter the edges that are relevant there. That would let you manage the memory size but still would result in duplicate computations.
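To illustrate the block-reading idea, here is a rough host-side sketch (this is not a cuGraph API, just an illustration; it assumes a coordinate Matrix Market file, ignores any edge values, and leaves the shuffle to the owning GPUs as a comment):

```cpp
#include <mpi.h>

#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Each rank parses roughly 1/nranks of the Matrix Market body, snapping the
// block boundaries to line breaks so that no edge is read twice or skipped.
std::vector<std::pair<int64_t, int64_t>> read_mtx_block(std::string const& path, int rank, int nranks)
{
  std::ifstream in(path);
  std::string line;

  // Skip the banner/comments and the size line; remember where the edge data starts.
  std::streampos data_start{};
  while (std::getline(in, line)) {
    if (!line.empty() && line[0] != '%') {  // this was the "rows cols nnz" line
      data_start = in.tellg();
      break;
    }
  }

  in.seekg(0, std::ios::end);
  auto const data_end  = static_cast<int64_t>(in.tellg());
  auto const begin     = static_cast<int64_t>(data_start);
  auto const block     = (data_end - begin + nranks - 1) / nranks;
  auto const begin_off = std::min(begin + rank * block, data_end);
  auto const end_off   = std::min(begin + (rank + 1) * block, data_end);

  std::vector<std::pair<int64_t, int64_t>> edges;
  if (begin_off >= end_off) { return edges; }

  in.clear();
  if (rank == 0) {
    in.seekg(begin_off);
  } else {
    // The line containing byte (begin_off - 1) belongs to the previous rank; skip past it.
    in.seekg(begin_off - 1);
    std::getline(in, line);
  }

  while (true) {
    auto const pos = static_cast<int64_t>(in.tellg());
    if (pos < 0 || pos >= end_off) { break; }  // lines starting past end_off belong to the next rank
    if (!std::getline(in, line)) { break; }
    std::istringstream ss(line);
    int64_t src{}, dst{};
    if (ss >> src >> dst) { edges.emplace_back(src - 1, dst - 1); }  // MTX indices are 1-based
  }

  // Next step (not shown): shuffle these host edges to the GPU that owns them
  // under cuGraph's edge partitioning, copy them to device memory, and build the MG graph.
  return edges;
}

int main(int argc, char** argv)
{
  MPI_Init(&argc, &argv);
  int rank{}, nranks{};
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &nranks);

  if (argc > 1) {
    auto edges = read_mtx_block(argv[1], rank, nranks);
    std::printf("rank %d parsed %zu edges\n", rank, edges.size());
  }

  MPI_Finalize();
  return 0;
}
```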
Thanks. Even after invoking handle.reset before MPI_Finalize, I am still getting some errors from MPI, but they are not on the critical path. I am closing the issue.
What is your question?
This is the second part of: https://github.com/rapidsai/cugraph/issues/4596
I am trying to run the multi-GPU test (https://github.com/rapidsai/cugraph/blob/branch-24.10/cpp/examples/users/multi_gpu_application/mg_graph_algorithms.cpp) on a single node; this is my job script:
I am encountering a segfault (from every process):
Please advise; the OpenMPI on this platform is CUDA-aware: