-
I followed the code instructions and ran my GPT-2 model with Bagua
on 1 node with 2 GPUs, but I got this error.
My code works in a pure PyTorch distributed environment.
Here is the source code:
```
self.mod…
-
-
here: https://github.com/BaguaSys/tutorials
rendered version: https://baguasys.github.io/tutorials/kubernetes-integration/index.html
-
For easier benchmarking.
-
As mentioned here: https://github.com/NVIDIA/nccl/issues/450#issuecomment-758082610
-
Currently there are only `broadcast` and `allreduce`: https://bagua.readthedocs.io/en/latest/autoapi/bagua/torch_api/communication/index.html
- [x] We need a python test script to test all primitiv…
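
As a rough sketch of what such a test script could compare against, the snippet below models the expected result of `broadcast` and a sum `allreduce` in plain Python. It does not call Bagua. The helper names (`expected_broadcast`, `expected_allreduce_sum`) and the list-of-lists representation of per-rank buffers are assumptions made for illustration only.

```python
# Pure-Python reference model of collective results; it does NOT call Bagua.
# All function names below are hypothetical helpers, not part of any library.
from typing import List

Buffers = List[List[float]]  # one flat buffer per rank


def expected_broadcast(per_rank: Buffers, src: int = 0) -> Buffers:
    """After a broadcast, every rank holds a copy of the source rank's buffer."""
    return [list(per_rank[src]) for _ in per_rank]


def expected_allreduce_sum(per_rank: Buffers) -> Buffers:
    """After a sum allreduce, every rank holds the element-wise sum over ranks."""
    total = [sum(vals) for vals in zip(*per_rank)]
    return [list(total) for _ in per_rank]


if __name__ == "__main__":
    # Two ranks, each with a two-element buffer.
    buffers = [[1.0, 2.0], [3.0, 4.0]]
    print(expected_broadcast(buffers, src=1))
    print(expected_allreduce_sum(buffers))
```

A real test would run the actual primitive on each rank and compare that rank's output tensor with the matching entry from this model.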
-
-
-
https://github.com/BaguaSys/bagua/blob/96cb6fe72dfcb2d0394e465291a31aff1f3e0142/bagua/torch_api/distributed.py#L276-L281
With `bagua-core` 0.3, we no longer need to flatten all at once. Now we can …
-
I see examples only for PyTorch.