AI-Hypercomputer / xpk

xpk (Accelerated Processing Kit, pronounced x-p-k,) is a software tool to help Cloud developers to orchestrate training jobs on accelerators such as TPUs and GPUs on GKE.
Apache License 2.0
81 stars 23 forks source link

Add device type and tpu type argument to cacheimage #55

Closed Obliviour closed 10 months ago

Obliviour commented 10 months ago

This sets us up for multiaccelerator support in the same cluster. This also fixes the broken cacheimage command currently.

 python3 xpk.py cluster cacheimage --cluster CLUSTER --docker-image IMAGE --zone=ZONE --tpu-type v5litepod-16

Fixes / Features

Testing / Documentation

Ran xpk cacheimage and it works now.