dstackai / dstack

dstack is an open-source alternative to Kubernetes, designed to simplify development, training, and deployment of AI across any cloud or on-prem. It supports NVIDIA, AMD, and TPU.
https://dstack.ai/docs
Mozilla Public License 2.0
1.33k stars, 98 forks

[Feature]: Configuration for master node network selection #1209

Closed TheBits closed 4 months ago

TheBits commented 4 months ago

Problem

A job running in multinode mode must correctly identify the IP address of the master node. A master node instance can have multiple network addresses, and we must select the desired address to use.

Solution

I suggest adding a --network flag to the dstack pool add-ssh command. Using this option, we can specify which IP network (and therefore which IP address) to use from those available on the master node.

The master node must store the IP address of the desired network. The workers will receive the address of the master node.
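
As a rough illustration of the selection step, here is a minimal sketch, assuming the value passed to the proposed --network flag is a CIDR string and that the master node's addresses are already known as plain strings. The function name select_address and the sample addresses are hypothetical and are not part of dstack.

```python
import ipaddress
from typing import Iterable, Optional


def select_address(addresses: Iterable[str], network: str) -> Optional[str]:
    """Return the first address that belongs to `network`, or None if none match.

    `network` is a CIDR string such as "10.0.0.0/24" (assumed format of the
    proposed --network value; hypothetical).
    """
    # strict=False tolerates a CIDR whose host bits are set, e.g. "10.0.0.5/24".
    net = ipaddress.ip_network(network, strict=False)
    for raw in addresses:
        addr = ipaddress.ip_address(raw)
        # Skip addresses from the other IP family, then test membership.
        if addr.version == net.version and addr in net:
            return str(addr)
    return None


if __name__ == "__main__":
    # Hypothetical addresses found on a master node with several interfaces.
    master_addresses = ["203.0.113.7", "10.0.0.5", "192.168.1.20"]
    print(select_address(master_addresses, "10.0.0.0/24"))  # prints 10.0.0.5
```

The master node would store the returned address, and the workers would be given that value. Returning None when nothing matches lets the caller decide whether to fail or fall back to the current behavior.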

Workaround

Multinode currently works because the master node has only one IP address on the internal network.

Would you like to help us implement this feature by sending a PR?

Yes

peterschmidt85 commented 4 months ago

Why dstack run? I thought it must be dstack pool add-ssh.

TheBits commented 4 months ago

Oh, yes, you're right.