Bluefog-Lib / bluefog

Distributed and decentralized training framework for PyTorch over graph
https://bluefog-lib.github.io/bluefog/
Apache License 2.0

MPI run with libcudart.so problem #12

Closed BichengYing closed 4 years ago

BichengYing commented 4 years ago

Running "machine1# mpirun --allow-run-as-root -np 1 -H machine2 data" from machine1, to launch a process on machine2,

fails with "orted: error while loading shared libraries: libcudart.so.10.0: cannot open shared object file: No such file or directory"

My initial guess is that LD_LIBRARY_PATH is not updated on the machine where orted is launched.

lucweichen commented 4 years ago

Add the directory that contains libcudart.so.10.0 to LD_LIBRARY_PATH.
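
A minimal sketch of that fix, for illustration only. The directory /usr/local/cuda-10.0/lib64 is an assumption (a common default install location), and the helper name prepend_ld_library_path is hypothetical; neither comes from this thread. Use whichever directory actually holds libcudart.so.10.0 on your machine.

```shell
# Assumed CUDA 10.0 library directory; override CUDA_LIB_DIR if
# libcudart.so.10.0 lives somewhere else on your system.
CUDA_LIB_DIR="${CUDA_LIB_DIR:-/usr/local/cuda-10.0/lib64}"

# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH
# unless it is already listed, then export the variable.
prepend_ld_library_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

prepend_ld_library_path "$CUDA_LIB_DIR"
```

Because the error is reported by orted on the remote machine, the variable most likely has to be set on machine2 itself, in a shell startup file that non-interactive remote shells also read, rather than only in the terminal on machine1.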