Hi, does ROCm PyTorch support distributed training with the MPI backend?
Right now I can't get PyTorch to work with MPI. The error message is as follows:
RuntimeError: CUDA tensor detected and the MPI used doesn't have CUDA-aware MPI support
What's the problem? Could you give me some advice? Thanks :)