Bluefog-Lib / bluefog

Distributed and decentralized training framework for PyTorch over graph
https://bluefog-lib.github.io/bluefog/
Apache License 2.0

Profiling the GPU memory usage in neighbor_ops versus win_ops #9

Open BichengYing opened 4 years ago

BichengYing commented 4 years ago

It is known that win_ops duplicates the parameters and therefore consumes more GPU memory. However, the actual memory usage at runtime is unclear. We need tools to measure it accurately.
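One possible starting point for such a tool is PyTorch's built-in CUDA memory counters. The sketch below is only an illustration, not part of Bluefog: the helper name `profile_gpu_memory` and its `step_fn` argument are hypothetical, and it only sees memory allocated through PyTorch's caching allocator, so buffers allocated directly by MPI or NCCL would not be counted.

```python
import torch


def profile_gpu_memory(step_fn, device=None):
    """Run step_fn once and report PyTorch-allocated GPU memory in bytes.

    step_fn is a hypothetical zero-argument callable wrapping the operation
    under test (for example, one training step using a given communication op).
    """
    if not torch.cuda.is_available():
        # No GPU present: still run the step, but there is nothing to measure.
        step_fn()
        return {"before": 0, "after": 0, "peak": 0, "peak_extra": 0}

    # Finish pending kernels so earlier work does not leak into the numbers.
    torch.cuda.synchronize(device)
    # Reset the peak counter so it reflects only what step_fn does.
    torch.cuda.reset_peak_memory_stats(device)
    before = torch.cuda.memory_allocated(device)

    step_fn()

    torch.cuda.synchronize(device)
    after = torch.cuda.memory_allocated(device)
    peak = torch.cuda.max_memory_allocated(device)
    return {
        "before": before,            # bytes allocated before the step
        "after": after,              # bytes still allocated after the step
        "peak": peak,                # highest allocation seen during the step
        "peak_extra": peak - before, # extra memory the step needed at its peak
    }
```

Wrapping the same training step once with allreduce, once with neighbor_ops, and once with win_ops, then comparing `peak_extra` and `after`, would give a first rough comparison. The `after` value is the one most likely to show any persistent parameter copies held by win_ops.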

Bluefog-Lib commented 4 years ago

Based on observations so far, neither neighbor_ops nor win_ops uses much more GPU memory than allreduce.