Mon-ius closed this 6 months ago
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).
View this failed invocation of the CLA check for more information.
For the most up to date status, view the checks section at the bottom of the pull request.
Sounds good. I also prefer fewer and simpler dependencies.
Have you tested the code using run_xla.py? You can follow the instructions in the "Try It out with PyTorch/XLA" section in README.md.
Btw, since you are here, you can also remove all the redundant dependencies in the Dockerfiles: https://github.com/google/gemma_pytorch/tree/main/docker
Thanks
Absolutely! I tested on both TPUv4-8 x 8 and A100 GPU x 8.
Done 🤗
Maybe also remove the dependencies in the Dockerfile?
https://github.com/google/gemma_pytorch/blob/main/docker/Dockerfile
My mistake, it should all be done now 🤗
Awesome, thanks for the contribution. Merging it now.
The only use of fairscale in this repo is
from fairscale.nn.model_parallel.utils import divide_and_check_no_remainder, split_tensor_along_last_dim
so these two helpers can simply be migrated into the codebase and the fairscale dependency dropped. It makes the build simpler and cleaner 🤗
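For reference, here is a minimal sketch of what local copies of those two helpers might look like. It assumes the same behavior as the fairscale versions (integer division that asserts there is no remainder, and an even split of a tensor along its last dimension). The code actually merged in this PR may differ.

```python
import torch


def divide_and_check_no_remainder(numerator: int, denominator: int) -> int:
    """Return numerator // denominator, asserting the division is exact."""
    assert numerator % denominator == 0, (
        f"{numerator} is not divisible by {denominator}"
    )
    return numerator // denominator


def split_tensor_along_last_dim(
    tensor: torch.Tensor,
    num_partitions: int,
    contiguous_split_chunks: bool = False,
):
    """Split a tensor into num_partitions equal chunks along its last dim.

    Assumption: mirrors the fairscale helper of the same name; not verified
    against the code merged in this PR.
    """
    last_dim = tensor.dim() - 1
    # Size of each chunk; fails if the last dim is not evenly divisible.
    chunk_size = divide_and_check_no_remainder(
        tensor.size(last_dim), num_partitions
    )
    chunks = torch.split(tensor, chunk_size, dim=last_dim)
    # torch.split returns views; optionally copy each into contiguous memory.
    if contiguous_split_chunks:
        return tuple(chunk.contiguous() for chunk in chunks)
    return chunks
```

With something like this in place, the import above can point at a local module instead of fairscale, and the rest of the model code stays unchanged.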