Can't update project such that VM is on the same node and network as existing zdb

scottyeager commented 1 week ago

I have a set of deployments that includes some zdbs and a VM. Sometimes I want to tear down the VM and recreate it on another node. I noticed that if the new node is one already used for the zdbs, the new VM fails to deploy.

Note that due to #552, these deployments must be created without mycelium.

Here's an example original project in Python:

import os
import pulumi
import pulumi_threefold as threefold

mnemonic = os.environ["MNEMONIC"]
network = os.environ["NETWORK"]

provider = threefold.Provider("provider", mnemonic=mnemonic, network=network)

node_id1 = 8
node_id2 = 10
network_nodes = [node_id1, node_id2]

network = threefold.Network(
    "network",
    name="test",
    description="test network",
    nodes=network_nodes,
    ip_range="10.1.0.0/16",
    opts=pulumi.ResourceOptions(provider=provider),
)

deployment1 = threefold.Deployment(
    "deployment1",
    node_id=node_id1,
    name="deployment1",
    network_name="test",
    zdbs=[
        threefold.ZDBInputArgs(
            name="zdbsTest",
            size=1,
            password="123456",
        )
    ],
    opts=pulumi.ResourceOptions(provider=provider, depends_on=[network]),
)

deployment2 = threefold.Deployment(
    "deployment2",
    node_id=node_id2,
    name="deployment2",
    network_name="test",
    vms=[
        threefold.VMInputArgs(
            name="vm",
            node_id=node_id2,
            flist="https://hub.grid.tf/tf-official-apps/base:latest.flist",
            entrypoint="/sbin/zinit init",
            network_name="test",
            cpu=1,
            memory=256,  # MB
            env_vars={
                "SSH_KEY": None,
            },
        )
    ],
    opts=pulumi.ResourceOptions(provider=provider, depends_on=[network]),
)

Now here's the updated project, with the VM moved inside the first deployment:

import os
import pulumi
import pulumi_threefold as threefold

mnemonic = os.environ["MNEMONIC"]
network = os.environ["NETWORK"]

provider = threefold.Provider("provider", mnemonic=mnemonic, network=network)

node_id1 = 8
node_id2 = 10
network_nodes = [node_id1]
# network_nodes = [node_id1, node_id2]

network = threefold.Network(
    "network",
    name="test",
    description="test network",
    nodes=network_nodes,
    ip_range="10.1.0.0/16",
    opts=pulumi.ResourceOptions(provider=provider),
)

deployment1 = threefold.Deployment(
    "deployment1",
    node_id=node_id1,
    name="deployment1",
    network_name="test",
    zdbs=[
        threefold.ZDBInputArgs(
            name="zdbsTest",
            size=1,
            password="123456",
        )
    ],
    vms=[
        threefold.VMInputArgs(
            name="vm",
            node_id=node_id2,
            flist="https://hub.grid.tf/tf-official-apps/base:latest.flist",
            entrypoint="/sbin/zinit init",
            network_name="test",
            cpu=1,
            memory=256,  # MB
            env_vars={
                "SSH_KEY": None,
            },
        )
    ],
    opts=pulumi.ResourceOptions(provider=provider, depends_on=[network]),
)

When I try to apply this, I see this error:

    threefold grid provider setup
    error: update failed

  threefold:index:Deployment (deployment1):
    error: could not generate deployments data: 1 error occurred:
        * failed to assign node ips: 2 errors occurred:
        * couldn't calculate networks used ips: 1 error occurred:
        * failed to get used host ids for network test node 8: invalid ip range : invalid CIDR address:

        * invalid ip range : invalid CIDR address:
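The message stops right after "invalid CIDR address:", which suggests the range string being parsed was empty. That is a guess from the log, not something confirmed against the provider code. A minimal Python sketch of that failure mode, using the standard `ipaddress` module as a stand-in for the provider's own parser (`parse_ip_range` is a hypothetical helper, not part of `pulumi_threefold`):

```python
import ipaddress


def parse_ip_range(cidr: str):
    """Parse a CIDR string; on failure, report the offending value explicitly."""
    try:
        return ipaddress.ip_network(cidr, strict=True)
    except ValueError as exc:
        # Quoting the value with !r makes an empty string visible in the message.
        raise ValueError(f"invalid ip range: invalid CIDR address: {cidr!r}") from exc


# The range configured in the project parses fine.
configured = parse_ip_range("10.1.0.0/16")

# An empty string, as the log seems to show, does not.
try:
    parse_ip_range("")
    empty_error = None
except ValueError as exc:
    empty_error = str(exc)
```

If that guess is right, the failure is not about the configured `ip_range` itself but about some stored per-node range coming back empty during the update.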

I also tried keeping the second deployment and just changing its node ID to match the first:

import os
import pulumi
import pulumi_threefold as threefold

mnemonic = os.environ["MNEMONIC"]
network = os.environ["NETWORK"]

provider = threefold.Provider("provider", mnemonic=mnemonic, network=network)

node_id1 = 8
node_id2 = 8
network_nodes = [node_id1]

network = threefold.Network(
    "network",
    name="test",
    description="test network",
    nodes=network_nodes,
    ip_range="10.1.0.0/16",
    opts=pulumi.ResourceOptions(provider=provider),
)

deployment1 = threefold.Deployment(
    "deployment1",
    node_id=node_id1,
    name="deployment1",
    network_name="test",
    zdbs=[
        threefold.ZDBInputArgs(
            name="zdbsTest",
            size=1,
            password="123456",
        )
    ],
    opts=pulumi.ResourceOptions(provider=provider, depends_on=[network]),
)

deployment2 = threefold.Deployment(
    "deployment2",
    node_id=node_id2,
    name="deployment2",
    network_name="test",
    vms=[
        threefold.VMInputArgs(
            name="vm",
            node_id=node_id2,
            flist="https://hub.grid.tf/tf-official-apps/base:latest.flist",
            entrypoint="/sbin/zinit init",
            network_name="test",
            cpu=1,
            memory=256,  # MB
            env_vars={
                "SSH_KEY": None,
            },
        )
    ],
    opts=pulumi.ResourceOptions(provider=provider, depends_on=[network]),
)

This also fails, but with a different error:

Diagnostics:
  threefold:index:Deployment (deployment2):
    error: failed to revert deployments: error waiting deployment: workload vm within deployment 708032 failed with error: could not get network resource subnet: couldn't load network with id (8Zjj8mJoReynM): open /var/run/cache/networkd/networks/8Zjj8mJoReynM: no such file or directory; try again: error waiting deployment: workload vm within deployment 708031 failed with error: IP 10.1.3.2 is not part of local nr subnet 10.1.2.0/24

  pulumi:pulumi:Stack (cidr-fail-test-test):
    11:21AM INF starting peer session=tf-229893 twin=5545

    threefold grid provider setup
    error: update failed
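The second half of that error can be checked by hand. The sketch below uses Python's standard `ipaddress` module with values copied from the project and the error message. It only illustrates the subnet arithmetic and says nothing about why the provider picked that address.

```python
import ipaddress

network_range = ipaddress.ip_network("10.1.0.0/16")  # ip_range from the project
node_subnet = ipaddress.ip_network("10.1.2.0/24")    # "local nr subnet" from the error
vm_ip = ipaddress.ip_address("10.1.3.2")             # IP from the error

# Is the VM's IP inside the overall network range, and inside the node's subnet?
in_network = vm_ip in network_range
in_node_subnet = vm_ip in node_subnet

# The /24 that actually contains the VM's IP.
vm_subnet = ipaddress.ip_network("10.1.3.2/24", strict=False)

print(in_network, in_node_subnet, vm_subnet)
```

The address sits inside the network's 10.1.0.0/16 range but in 10.1.3.0/24 rather than the node's 10.1.2.0/24. That would be consistent with the VM keeping an IP from the subnet previously assigned to another node, though the log alone does not confirm it.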

rawdaGastan commented 1 week ago

You cannot use two different nodes, one for the deployment and one for the VM.

scottyeager commented 5 days ago

> you cannot use 2 different nodes one for the deployment and one for the vm

Right, that's a good catch. I don't understand why the node id is specified again on the VM. Wasn't it just derived from the deployment before?

The other case, with two separate deployments, is really my main concern though.

rawdaGastan commented 5 days ago

> Right, that's a good catch. I don't understand why the node id is specified again on the VM. Wasn't it just derived from the deployment before?

Yes, it's there in case we need to support multiple nodes per deployment.
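Until multiple nodes per deployment are supported, a mismatch like the one in the second project (deployment on node 8, VM on node 10) could be caught before running `pulumi up`. A minimal sketch, assuming the VM settings are held as plain dicts rather than `threefold.VMInputArgs`; `find_node_mismatches` is a hypothetical helper and not part of `pulumi_threefold`:

```python
def find_node_mismatches(deployment_node_id, vms):
    """Return the names of VMs whose node_id differs from the deployment's node_id.

    A VM without an explicit node_id is treated as inheriting the deployment's.
    """
    return [
        vm["name"]
        for vm in vms
        if vm.get("node_id", deployment_node_id) != deployment_node_id
    ]


# Values from the second project: deployment1 on node 8, its VM on node 10.
mismatches = find_node_mismatches(8, [{"name": "vm", "node_id": 10}])

# After setting the VM's node_id to match the deployment, nothing is reported.
fixed = find_node_mismatches(8, [{"name": "vm", "node_id": 8}])

print(mismatches, fixed)
```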

threefoldtech / pulumi-threefold

Can't update project such that VM is on the same node and network as existing zdb #553