AI-Hypercomputer / xpk

xpk (Accelerated Processing Kit, pronounced x-p-k,) is a software tool to help Cloud developers to orchestrate training jobs on accelerators such as TPUs and GPUs on GKE.
Apache License 2.0
81 stars 23 forks source link

Add --tpu-topology flag for specifying custom topology types #57

Open Obliviour opened 10 months ago

Obliviour commented 10 months ago

Fixes / Features

Testing / Documentation

Added a check to make sure the format of topologies fits with the argument expected and the command fails if the format is unexpected.

Obliviour commented 8 months ago

The workload creation needs to know the topology as well so CURRENTLY the user needs to set the same topology for workload create and cluster create. This only adds the flag to cluster create.

In this case, users would have to manage the topology.