Hi, I'm trying to run PageRank across two MPI (MPICH v4.0.2) hosts, each with two NUMA nodes. The input graph is very large (>4B vertices), so I converted Gemini to use uint64_t vertex IDs. Everything seems to work fine until the locality chunking phase and the subsequent computation of partition offsets.
At that point Gemini fails the assertion at line 854 of core/graph.hpp, and by adding some debug prints I can see that the two machines have indeed computed different partition offsets for NUMA node 1.
I was able to avoid the failure by inserting an extra MPI_Allreduce (with MPI_MAX) just before the existing MPI_Allreduce that sets up the global_partition_offset array, so that the maximum of the locally computed partition offsets is stored directly into each machine's local partition offset array. However, I'm not sure this is entirely correct.
Any ideas on a possible cause for the issue? And is my workaround correct?
Note: I have forked the repo so you can take a look at my modifications.