Open cwpearson opened 4 years ago
That shouldn't be breaking like that. I'll see if I can reproduce this crash.
I was able to reproduce this issue. It should now be fixed on develop.
Hi, thank you. In commit 3c4a1be7b2ff793fabf2b299dfa673e2fae8e86f, I no longer see the crash.
However, what I would like to do is see how the MPI implementation handles MPI derived datatypes on GPU memory, so I am now running:
~/software/openmpi-4.0.5/bin/mpirun -n 2 bin/comb 256_256_256 -divide 2_1_1 -comm disable mock -comm enable mpi -exec enable mpi_type -memory disable host -memory enable cuda_device
It seems that no actual benchmarks are run, as the output is this:
Comb version 0.2.0
Args bin/comb;256_256_256;-divide;2_1_1;-comm;disable;mock;-comm;enable;mpi;-exec;enable;mpi_type;-memory;disable;host;-memory;enable;cuda_device
Started rank 0 of 2
Node deneb
Compiler "/usr/bin/g++-10"
Cuda compiler "/usr/local/cuda/bin/nvcc"
Cuda driver version 11010
Cuda runtime version 11010
GPU 0 visible undefined
Cart coords 0 0 0
Message policy cutoff 200
Post Recv using wait_all method
Post Send using wait_all method
Wait Recv using wait_all method
Wait Send using wait_all method
Num cycles 5
Num vars 1
ghost_widths 1 1 1
sizes 256 256 256
divisions 2 1 1
periodic 0 0 0
division map
map 0 0 0
map 128 256 256
map 256
Is this configuration supported?
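To make concrete what I mean by "types on the GPU": the pattern I'm hoping the mpi_type exec exercises is roughly the standalone sketch below (not COMB code; the slab size and vector layout are made up), i.e. describing strided device memory with an MPI derived datatype and handing the device pointer straight to MPI.

// Hypothetical standalone sketch, not COMB code: exchange one strided
// "ghost plane" of GPU memory between two ranks via an MPI derived datatype.
// Requires a CUDA-aware MPI; run with exactly 2 ranks.
// Build e.g.: mpicxx sketch.cpp -I${CUDA_HOME}/include -L${CUDA_HOME}/lib64 -lcudart
#include <mpi.h>
#include <cuda_runtime.h>
#include <cstdio>

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);
  int rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  const int n = 256;                               // made-up slab edge length
  double *sbuf = nullptr, *rbuf = nullptr;
  cudaMalloc(&sbuf, n * n * sizeof(double));
  cudaMalloc(&rbuf, n * n * sizeof(double));
  cudaMemset(sbuf, 0, n * n * sizeof(double));

  // One plane of an n x n slab: n blocks of 1 double, stride n.
  MPI_Datatype plane;
  MPI_Type_vector(n, 1, n, MPI_DOUBLE, &plane);
  MPI_Type_commit(&plane);

  int other = 1 - rank;                            // assumes exactly 2 ranks
  // Device pointers go straight to MPI; no host staging or manual packing.
  MPI_Sendrecv(sbuf, 1, plane, other, 0,
               rbuf, 1, plane, other, 0,
               MPI_COMM_WORLD, MPI_STATUS_IGNORE);

  if (rank == 0) printf("device-memory datatype exchange completed\n");

  MPI_Type_free(&plane);
  cudaFree(sbuf);
  cudaFree(rbuf);
  MPI_Finalize();
  return 0;
}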
It looks like -cuda_aware_mpi got dropped from the command line.
I dropped it because I interpreted it as only enabling some assertions and extra tests, but I now see that the benchmarks themselves are what the output refers to as "tests". The help text reads:
-cuda_aware_mpi Assert that you are using a cuda aware mpi implementation and enable tests that pass cuda device or managed memory to MPI
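(As an aside, my understanding is that the "assert that you are using a cuda aware mpi implementation" part can also be checked directly against the Open MPI build, independent of COMB, e.g. with ompi_info --parsable --all | grep mpi_built_with_cuda_support:value, or with Open MPI's extension query. A minimal sketch of the latter, outside of COMB:

// Generic Open MPI check, not COMB code: report the compile-time and
// run-time CUDA-aware support advertised by the MPI library itself.
#include <mpi.h>
#include <mpi-ext.h>   // Open MPI extensions (defines MPIX_CUDA_AWARE_SUPPORT)
#include <cstdio>

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);
#if defined(MPIX_CUDA_AWARE_SUPPORT) && MPIX_CUDA_AWARE_SUPPORT
  printf("compile-time CUDA-aware support: yes\n");
  printf("run-time CUDA-aware support: %s\n",
         MPIX_Query_cuda_support() ? "yes" : "no");
#else
  printf("this MPI does not advertise CUDA-aware support at compile time\n");
#endif
  MPI_Finalize();
  return 0;
}
)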
In any case, I tried again with the flag enabled:
$ ~/software/openmpi-4.0.5/bin/mpirun -n 2 bin/comb 256_256_256 -divide 2_1_1 -comm disable mock -comm enable mpi -exec enable mpi_type -memory disable host -memory enable cuda_device -cuda_aware_mpi
Comb version 0.2.0
Args bin/comb;256_256_256;-divide;2_1_1;-comm;disable;mock;-comm;enable;mpi;-exec;enable;mpi_type;-memory;disable;host;-memory;enable;cuda_device;-cuda_aware_mpi
Started rank 0 of 2
Node deneb
Compiler "/usr/bin/g++-10"
Cuda compiler "/usr/local/cuda/bin/nvcc"
Cuda driver version 11010
Cuda runtime version 11010
GPU 0 visible undefined
Cart coords 0 0 0
Message policy cutoff 200
Post Recv using wait_all method
Post Send using wait_all method
Wait Recv using wait_all method
Wait Send using wait_all method
Num cycles 5
Num vars 1
ghost_widths 1 1 1
sizes 256 256 256
divisions 2 1 1
periodic 0 0 0
division map
map 0 0 0
map 128 256 256
map 256
Hi,
I have built this on a system with a single GPU that I would like to share between two MPI ranks (just for the sake of getting things up and running). The build basically follows ubuntu_nvcc10_gcc8, adjusted for gcc 10. I built commit e06e54d351f7b31177db89f37b4326c8e96656bd (the latest at the time of writing). I tried to run it with the following:
~/software/openmpi-4.0.5/bin/mpirun -n 2 bin/comb 10_10_10 -divide 2_1_1 -cuda_aware_mpi -comm enable mpi -exec enable mpi_type -memory enable cuda_device
but I get the following error:
I also managed to run the focused tests:
which appears to have worked with the following output:
Is device memory + MPI + MPI_Type a supported configuration at this time? If so, any advice?
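In case it matters, the way I had in mind to share the single GPU between the two ranks is just the usual local-rank device selection; a minimal sketch, assuming Open MPI's OMPI_COMM_WORLD_LOCAL_RANK environment variable (other launchers use different variables):

// Hypothetical sketch, not COMB code: choose a CUDA device per rank so that
// two ranks can share (oversubscribe) a single GPU. Open MPI sets the
// local-rank env var before MPI_Init, so this can run before initialization.
#include <cuda_runtime.h>
#include <cstdlib>
#include <cstdio>

int main() {
  const char *lr = std::getenv("OMPI_COMM_WORLD_LOCAL_RANK");
  int local_rank = lr ? std::atoi(lr) : 0;

  int ndev = 0;
  cudaGetDeviceCount(&ndev);
  if (ndev > 0) {
    // With a single GPU, both ranks map to device 0.
    cudaSetDevice(local_rank % ndev);
  }
  printf("local rank %d using CUDA device %d of %d\n",
         local_rank, ndev > 0 ? local_rank % ndev : -1, ndev);
  return 0;
}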
Thanks!