astro / microvm.nix

NixOS MicroVMs
https://astro.github.io/microvm.nix/
MIT License

Issues exposing services using docker/podman containers inside QEMU MicroVM #203

Closed: cryptoluks closed this 5 months ago

cryptoluks commented 7 months ago

Hello,

I appreciate the effort that has gone into this project. Having recently transitioned to Nix and NixOS, I am now in the process of migrating my existing Terraform-managed libvirt QEMU machines to QEMU MicroVMs.

However, I have issues exposing Docker or Podman services, even to localhost, when starting a container inside the VM; I only get connection resets. The service itself runs fine (here traefik/whoami on port 80), but the forwarding from port 8000 to 80 does not work. It makes no difference whether I access port 8000 via the VM IP from the host, or via localhost/the VM IP from inside the VM.

Are there any requirements for running containers, such as packages or kernel modules, that are omitted in a stripped-down MicroVM?

Here is the stripped-down MicroVM definition. It behaves the same way when using docker instead of podman. I mostly followed the advanced networking docs, using the microvm bridge and adding each VM to it with a tap interface:

{ pkgs, ... }: {
  microvm.vms.test = {
    config = {
      system.stateVersion = "23.11";

      microvm = {
        mem = 8192;
        vcpu = 4;

        interfaces = [{
          type = "tap";
          id = "vm-test";
          mac = "02:00:00:00:00:02";
        }];

        shares = [
          {
            source = "/nix/store";
            mountPoint = "/nix/.ro-store";
            tag = "ro-store";
            proto = "virtiofs";
          }
        ];
      };

      systemd.network = {
        enable = true;
        networks = {
          "20-lan" = {
            matchConfig.Type = "ether";
            networkConfig = {
              Address = [ "10.1.0.2/24" ];
              Gateway = "10.1.0.1";
              DNS = [ "1.1.1.1" "8.8.8.8" ];
              DHCP = "no";
            };
          };
        };
      };

      networking.firewall.enable = false;

      virtualisation.oci-containers.containers = {
        whoami = {
          image = "docker.io/traefik/whoami";
          ports = [ "0.0.0.0:8000:80" ];
        };
      };

      services = {
        openssh = {
          enable = true;
          settings.PasswordAuthentication = false;
        };
      };

      users.users.root = {
        isNormalUser = false;
        openssh.authorizedKeys.keys =
          [ "..." ];
      };
    };
  };
}

I tried changing MTU settings, loading possibly required kernel modules like br_netfilter, switching the VM kernel to LTS 6.6, and more, all without success.

Thank you for taking the time.

astro commented 7 months ago

I am not very experienced with Docker (it's too messy), but your config seems to work for me:

[root@nixos:~]# ss -lntp
State     Recv-Q    Send-Q       Local Address:Port        Peer Address:Port    Process
LISTEN    0         128                0.0.0.0:22               0.0.0.0:*        users:(("sshd",pid=558,fd=3))
LISTEN    0         4096         127.0.0.53%lo:53               0.0.0.0:*        users:(("systemd-resolve",pid=352,fd=21))
LISTEN    0         4096               0.0.0.0:5355             0.0.0.0:*        users:(("systemd-resolve",pid=352,fd=12))
LISTEN    0         4096            127.0.0.54:53               0.0.0.0:*        users:(("systemd-resolve",pid=352,fd=23))
LISTEN    0         4096               0.0.0.0:8000             0.0.0.0:*        users:(("conmon",pid=670,fd=5))
LISTEN    0         128                   [::]:22                  [::]:*        users:(("sshd",pid=558,fd=4))
LISTEN    0         4096                  [::]:5355                [::]:*        users:(("systemd-resolve",pid=352,fd=14))

[root@nixos:~]# nc -vv 127.0.0.1 8000
Connection to 127.0.0.1 8000 port [tcp/irdmi] succeeded!
GET /
HTTP/1.1 400 Bad Request
Content-Type: text/plain; charset=utf-8
Connection: close

400 Bad Request

Because the firewall is disabled, iptables is not installed. Yet in contrast to Docker, Podman doesn't seem to set up NAT rules; it forwards using a userspace process (conmon) instead.

To try this locally, I commented out your specific tap network setup. Did you verify that outbound connectivity works and that the container image is fetched successfully?

cryptoluks commented 5 months ago

I was able to solve my issue by setting the container interfaces to unmanaged:

https://github.com/astro/microvm.nix/pull/204 https://astro.github.io/microvm.nix/simple-network.html#docker-and-systemd-network
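
For anyone skimming: per the linked docs, the fix boils down to a snippet like the following inside the MicroVM (a minimal sketch; the veth* glob assumes Podman's default naming for the per-container veth interfaces):

{
  systemd.network.networks."19-docker" = {
    # Match the veth interfaces Podman creates for containers
    # and tell systemd-networkd to leave them alone.
    matchConfig.Name = "veth*";
    linkConfig.Unmanaged = true;
  };
}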

Neon-44 commented 1 month ago

Having the same problem. The unmanaged config doesn't solve it for me.

Did you do anything else to solve it?

Neon-44 commented 1 month ago

Not really a fix, but a workaround for anyone stumbling on this later:

extraOptions = ["--network=slirp4netns"];
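
In context, that would look something like this (a sketch based on the whoami container from the original post; --network=slirp4netns makes Podman use user-mode networking, so the published port no longer depends on the veth/NAT plumbing):

{
  virtualisation.oci-containers.containers = {
    whoami = {
      image = "docker.io/traefik/whoami";
      ports = [ "0.0.0.0:8000:80" ];
      # Workaround: user-mode networking via slirp4netns instead
      # of the default rootful container network setup.
      extraOptions = [ "--network=slirp4netns" ];
    };
  };
}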

cryptoluks commented 1 week ago

Hey @Neon-44,

Having the same problem. The unmanaged config doesn't solve it for me. Did you do anything else to solve it?

Here is the config that worked for me on the host side:

{
  networking.nat = {
    enable = true;
    externalInterface = "eth0";
    internalInterfaces = [ "microvm" ];
  };
  systemd.network = {
    netdevs = {
      "10-microvm".netdevConfig = {
        Kind = "bridge";
        Name = "microvm";
      };
    };
    networks = {
      "10-microvm" = {
        matchConfig.Name = "microvm";
        addresses = [ { addressConfig.Address = "10.0.0.0/24"; } ];
      };
      "11-microvm" = {
        matchConfig.Name = "vm-*";
        networkConfig.Bridge = "microvm";
      };
    };
  };
}

Here is the systemd.network part of the config in the MicroVM context. Note that "19-docker" sorts before "20-lan", so the veth* match wins and keeps systemd-networkd from configuring the container veth interfaces that the Type = "ether" match would otherwise grab:

{
  systemd.network = {
    networks = {
      "19-docker" = {
        matchConfig.Name = "veth*";
        linkConfig = {
          Unmanaged = true;
        };
      };
      "20-lan" = {
        matchConfig.Type = "ether";
        networkConfig = {
          Address = [ "10.0.0.1/24" ];
          Gateway = "10.0.0.0";
          DNS = [
            "1.1.1.1"
            "8.8.8.8"
          ];
          DHCP = "no";
        };
      };
    };
  };
}

Hope this helps.