dstack is an open-source alternative to Kubernetes, designed to simplify development, training, and deployment of AI across any cloud or on-prem. It supports NVIDIA, AMD, and TPU.
Problem
A job running in multinode mode must correctly identify the IP address of the master node. A master node instance can have multiple network addresses, so we must be able to select which address to use.
Solution
I suggest adding the --network flag to the dstack pool add-ssh command. Using this option, we can specify which IP network (and therefore which IP address) to use from those available on the master node.
The master node must store the IP address of the desired network. The workers will receive that address as the address of the master node.
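To illustrate the selection step, here is a minimal sketch of how the value passed to --network could be matched against the addresses found on the master node. It uses only Python's standard ipaddress module. The function name pick_address_in_network and the sample addresses are hypothetical and are not part of dstack; the assumption is that the network is given in CIDR notation.

```python
import ipaddress
from typing import Iterable, Optional


def pick_address_in_network(addresses: Iterable[str], network: str) -> Optional[str]:
    """Return the first address that belongs to `network`, or None.

    Hypothetical helper for illustration only, not a dstack API.
    `network` is assumed to be in CIDR notation, e.g. "10.0.0.0/24".
    """
    # strict=False also accepts a CIDR with host bits set, e.g. "10.0.0.5/24".
    net = ipaddress.ip_network(network, strict=False)
    for raw in addresses:
        addr = ipaddress.ip_address(raw)
        # Membership is False when the IP versions differ (IPv4 vs IPv6).
        if addr in net:
            return str(addr)
    return None


# Example: the master node has a public and an internal address.
master_addresses = ["192.168.1.10", "10.0.0.5"]
print(pick_address_in_network(master_addresses, "10.0.0.0/24"))  # 10.0.0.5
```

If no address matches, the sketch returns None, so the caller can decide whether to fail or fall back to the current behavior.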
Workaround
Multinode currently works because the master node has only one IP address on the internal network.
Would you like to help us implement this feature by sending a PR?
Yes