AI-Hypercomputer / xpk

xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool that helps Cloud developers orchestrate training jobs on accelerators such as TPUs and GPUs on GKE.
Apache License 2.0

Create cluster from several reservations #161

Open DwarKapex opened 2 months ago

DwarKapex commented 2 months ago

I wonder whether xpk supports creating a cluster from several reservations. If not, are there any plans to add this feature?

Obliviour commented 1 month ago

Hi @DwarKapex, great question. We don't currently have a way to support this.

It might be partially supported by calling xpk cluster create repeatedly, as shown below, but the device type ($DEV) must be the same in every call. I am curious: do you have cases with multiple reservations of capacity within the same zone?

# Create three node pools of device type $DEV from reservation A
xpk cluster create --reservation=A --num-slices=3 --device-type=$DEV
# Create another two node pools of device type $DEV from reservation B
# (--num-slices is the new total: 3 existing + 2 new = 5)
xpk cluster create --reservation=B --num-slices=5 --device-type=$DEV
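
The iterative approach above can be sketched as a small dry-run helper that only prints the commands, so nothing is created until you review them. This is a sketch under assumptions, not xpk's documented behavior: it assumes, as the comments above imply, that --num-slices is the cumulative total after each call, and that slices created by earlier calls stay on their original reservation. The function name plan_cluster_creates and the RESERVATION:COUNT argument format are hypothetical, and any other flags your setup needs are omitted, as in the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints one `xpk cluster create` call per
# reservation. It does NOT run xpk; it only echoes the commands.
#
# Assumption (taken from the reply above, not verified against xpk):
# --num-slices is the cumulative slice total, so each call adds only the
# difference between the new total and the previous one.
#
# Usage: plan_cluster_creates DEVICE_TYPE RESERVATION:COUNT [RESERVATION:COUNT ...]
plan_cluster_creates() {
  local device_type="$1"
  shift
  local total=0 pair reservation count
  for pair in "$@"; do
    reservation="${pair%%:*}"   # text before the first colon
    count="${pair##*:}"         # text after the last colon
    total=$((total + count))    # running total of slices requested so far
    echo "xpk cluster create --reservation=${reservation} --num-slices=${total} --device-type=${device_type}"
  done
}
```

Calling plan_cluster_creates "$DEV" A:3 B:2 prints the same two commands as in the reply above, with --num-slices=3 for reservation A and --num-slices=5 for reservation B.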