kubeflow / training-operator

Distributed ML Training and Fine-Tuning on Kubernetes
https://www.kubeflow.org/docs/components/training
Apache License 2.0

Support ARM64 platform in PyTorch examples #2111

Closed tenzen-y closed 1 month ago

tenzen-y commented 1 month ago

Currently, all PyTorch examples support only the AMD64 platform. However, some users work on ARM64 machines such as MacBooks, so supporting the ARM64 platform would let them try the training-operator examples on their own machines easily.

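Contributors should support the following examples by adding the platform parameters here: https://github.com/kubeflow/training-operator/blob/e31d11faa9f6ce5111b60c01079d39295589e0ef/.github/workflows/publish-example-images.yaml#L24

- [ ] examples/pytorch/cpu-demo/Dockerfile
- [ ] examples/pytorch/elastic/imagenet/Dockerfile
- [ ] examples/pytorch/elastic/echo/Dockerfile
- [ ] examples/pytorch/mnist/Dockerfile
- [ ] examples/pytorch/mnist/Dockerfile-mpi
- [ ] examples/pytorch/smoke-dist/Dockerfile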
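
For readers unfamiliar with multi-platform image builds, here is a rough sketch of what "adding the platform parameters" could look like in a GitHub Actions workflow. It is not the repository's actual `publish-example-images.yaml`: the workflow name, job layout, build context, and image tag are all assumptions made for illustration. Only the Dockerfile path and the two platform names come from this issue.

```yaml
# Hypothetical workflow sketch; the real publish-example-images.yaml may be structured differently.
name: build-example-image
on: [pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # QEMU lets an AMD64 runner emulate ARM64 while building.
      - uses: docker/setup-qemu-action@v3
      # Buildx is the Docker builder that supports multi-platform images.
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v5
        with:
          context: examples/pytorch/mnist            # assumed build context
          file: examples/pytorch/mnist/Dockerfile    # one of the Dockerfiles named in this issue
          platforms: linux/amd64,linux/arm64         # the platform parameter in question
          push: false
          tags: example/pytorch-dist-mnist:latest    # hypothetical tag
```

A build like this can only succeed if the base image referenced in the Dockerfile also publishes an ARM64 variant, so each example may need to be checked individually.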

/good-first-issue

google-oss-prow[bot] commented 1 month ago

@tenzen-y: This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-good-first-issue command.

In response to [this](https://github.com/kubeflow/training-operator/issues/2111):

> Currently, we support only the AMD64 platform in all PyTorch examples.
> However, some users use ARM64 machines such as Macbooks. So, supporting the ARM64 platform allows users to try the training-operator examples on their machines easily.
>
> Contributors should support the following examples by adding the platform parameters here: https://github.com/kubeflow/training-operator/blob/e31d11faa9f6ce5111b60c01079d39295589e0ef/.github/workflows/publish-example-images.yaml#L24
>
> - [ ] examples/pytorch/cpu-demo/Dockerfile
> - [ ] examples/pytorch/elastic/imagenet/Dockerfile
> - [ ] examples/pytorch/elastic/echo/Dockerfile
> - [ ] examples/pytorch/mnist/Dockerfile
> - [ ] examples/pytorch/mnist/Dockerfile-mpi
> - [ ] examples/pytorch/smoke-dist/Dockerfile
>
> /good-first-issue

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

danielsuh05 commented 1 month ago

Hey! Interested in this project, could I be assigned to this issue?

tenzen-y commented 1 month ago

> Hey! Interested in this project, could I be assigned to this issue?

Sure, feel free to assign yourself with the /assign comment.

danielsuh05 commented 1 month ago

/assign