BichengYing closed this 4 years ago
Done. Introduced two environment variables: BLUEFOG_WIN_ON_CPU=1 and BLUEFOG_OPS_ON_CPU=1. When these are set, all win ops and other ops copy the torch CUDA tensor to the CPU, communicate the CPU vector through MPI, and finally copy the result back to CUDA.
Since there are many issues with running win_ops on GPU-aware MPI, it is necessary to allow win_ops to run through CPU communication.
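For readers who want to see the staging pattern described above, here is a minimal sketch. It is not the BlueFog implementation: the helper names `flag_enabled` and `communicate_via_cpu` are hypothetical, the `communicate` callable stands in for whatever MPI-backed op is being run, and treating the value "1" as "enabled" is an assumption about how the variables are parsed. The tensor only needs the `is_cuda`, `device`, `cpu()` and `to()` members that torch tensors provide.

```python
import os


def flag_enabled(name):
    """Return True when the environment variable `name` is set to "1".

    Assumption: "1" means enabled; the real parsing may differ.
    """
    return os.environ.get(name, "0") == "1"


def communicate_via_cpu(tensor, communicate, flag="BLUEFOG_OPS_ON_CPU"):
    """Run `communicate` on a CPU copy of a CUDA tensor when `flag` is set.

    `communicate` is a hypothetical callable that takes a tensor and
    returns the communicated result (for example, an MPI-backed op).
    """
    if not (flag_enabled(flag) and tensor.is_cuda):
        # Flag unset, or tensor already on the CPU: communicate directly.
        return communicate(tensor)

    device = tensor.device               # remember the original CUDA device
    result = communicate(tensor.cpu())   # communicate the CPU copy
    return result.to(device)             # copy the result back to CUDA
```

The extra device-to-host and host-to-device copies add overhead, but they avoid depending on GPU-aware MPI. Both variables are typically set in the environment before the job is launched, for example `BLUEFOG_WIN_ON_CPU=1 BLUEFOG_OPS_ON_CPU=1`.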