aicoe-kaggle / diabetic-retinopathy


Test cluster connectivity #6

Open TreeinRandomForest opened 3 years ago

TreeinRandomForest commented 3 years ago

PyTorch's distributed package (`torch.distributed`) can use a variety of communication backends, including MPI. See: https://pytorch.org/tutorials/beginner/dist_overview.html

Experiment: Use the low-level API (blocking send and recv, per https://pytorch.org/tutorials/intermediate/dist_tuto.html) to send a message, i.e. an integer i, from rank i % N to rank (i + 1) % N, where N is the number of workers ("WORLD_SIZE"), and print the result to log files.
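
A minimal sketch of what this experiment could look like, not a tested implementation for this cluster. Assumptions: the `gloo` backend is used as a stand-in (the `mpi` backend requires a PyTorch build with MPI support), the launcher sets `RANK`, `WORLD_SIZE`, `MASTER_ADDR` and `MASTER_PORT`, and the names `ring_neighbors`, `run` and the `rank_<i>.log` file pattern are hypothetical.

```python
import logging
import os


def ring_neighbors(rank: int, world_size: int) -> tuple[int, int]:
    """Return (source, destination) ranks for `rank` in a ring of `world_size`."""
    return (rank - 1) % world_size, (rank + 1) % world_size


def run(backend: str = "gloo") -> None:
    # Imported here so the ring arithmetic above can be checked without PyTorch.
    import torch
    import torch.distributed as dist

    # With no init_method given, the process group is initialised from the
    # environment variables RANK, WORLD_SIZE, MASTER_ADDR and MASTER_PORT.
    dist.init_process_group(backend=backend)
    rank, world_size = dist.get_rank(), dist.get_world_size()
    src, dst = ring_neighbors(rank, world_size)

    # One log file per rank (hypothetical naming scheme).
    logging.basicConfig(filename=f"rank_{rank}.log", level=logging.INFO)

    send_buf = torch.tensor([rank])               # the integer i sent by rank i
    recv_buf = torch.zeros(1, dtype=send_buf.dtype)

    if world_size > 1:
        # send/recv are blocking. If every rank sent first, the ring could
        # deadlock, so rank 0 sends first and all other ranks receive first.
        if rank == 0:
            dist.send(tensor=send_buf, dst=dst)
            dist.recv(tensor=recv_buf, src=src)
        else:
            dist.recv(tensor=recv_buf, src=src)
            dist.send(tensor=send_buf, dst=dst)
        logging.info("rank %d sent %d to rank %d and received %d from rank %d",
                     rank, rank, dst, recv_buf.item(), src)
    else:
        logging.info("rank %d: single worker, nothing to exchange", rank)

    dist.destroy_process_group()


# Only start the experiment when a distributed launcher has set RANK.
if __name__ == "__main__" and "RANK" in os.environ:
    run()
```

Saved as, say, `ring_test.py` (hypothetical name), this could be launched on a single node with `torchrun --nproc_per_node=4 ring_test.py`. If connectivity is fine, each `rank_<i>.log` should record the integer received from the previous rank in the ring.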