NixOS / nixpkgs

Nix Packages collection & NixOS

Open vSwitch: System will boot without network if the main port is added as a bridge member #346857

Open hehongbo opened 5 days ago

hehongbo commented 5 days ago

Describe the bug

With the default scripted networking, if the main outside-facing port is added as a bridge member and the bridge owns a manually set IP address (plus gateway and DNS), the system boots without network. More specifically, it boots with the network interface DOWN, and the bridge is not created.

This seems to be reproducible regardless of the type or vendor of the Ethernet adapter. I reproduced it with an on-board i219, a Mellanox ConnectX-3 PCIe card, and an Intel 82599 (X520) PCIe card.

This cannot be reproduced with systemd-networkd, only with scripted networking.

Steps To Reproduce

Steps to reproduce the behavior:

  1. Do a normal installation of NixOS with the configuration that nixos-generate-config produces, plus:

    networking.interfaces."mybridge" = {
      ipv4 = {
        addresses = [
          {
            address = "192.168.0.102";
            prefixLength = 24;
          }
        ];
        routes = [
          {
            address = "0.0.0.0";
            prefixLength = 0;
            via = "192.168.0.1";
          }
        ];
      };
    };

    environment.etc."resolv.conf".text = "nameserver 192.168.0.1";

    virtualisation.vswitch = {
      enable = true;
      resetOnStart = true;
    };

    networking.vswitches.mybridge = {
      interfaces = {
        eno1 = {};
      };
    };

Expected behavior

The system boots with eno1 (in my example) UP and added as a bridge member of mybridge, with the IP address set and the network online.

Screenshots

[Image: snapshot-1]

Additional context

The important and mysterious part of the issue is that if I build the system without setting resolv.conf, which leaves networking.resolvconf.enable at its default value (on), the issue disappears and is no longer reproducible. The result is a system where everything works except DNS resolution.

     };
   };

-  environment.etc."resolv.conf".text = "nameserver 192.168.0.1";
-
   virtualisation.vswitch = {
     enable = true;
     resetOnStart = true;
   };

I wouldn't have caught this at all if I had remembered to set DNS together with the gateway the first time I tried. My guess is that the resolvconf service briefly touches the interface before the Open vSwitch module "thinks" there is nothing to add.
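For comparison, DNS can also be declared without overwriting /etc/resolv.conf, which should leave networking.resolvconf.enable at its default. The fragment below is a minimal sketch of that variant. It assumes the networking.nameservers option is an acceptable replacement for the environment.etc."resolv.conf" line. It has not been tested against this bug, so whether it avoids the failure is unconfirmed.

```nix
{
  # Assumption: declare the resolver through the NixOS networking module
  # instead of writing /etc/resolv.conf directly, so that
  # networking.resolvconf.enable keeps its default value.
  networking.nameservers = [ "192.168.0.1" ];
}
```

If the system then boots with the bridge up and DNS working, that would support the idea that the presence of the resolvconf service affects the outcome. It would not explain why.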

Without setting resolv.conf, I can see this briefly during boot.

[Image: IMAGE 2024-10-06 18:51:05]

Also, without resetOnStart = true; this issue is still reproducible.

Notify maintainers

@netixx @adamcstephens @kmcopper

adamcstephens commented 5 days ago

While I maintain the package, I know very little about the module and cannot commit to supporting it. If it works with networkd then I’d recommend sticking with networkd.
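A minimal sketch of what sticking with networkd could look like is shown below. It assumes the networkd backend is selected with networking.useNetworkd; the thread does not show the configuration that was actually used for the networkd test. The bridge, port, and address values are copied from the report.

```nix
{
  # Assumption: switch from the default scripted backend to systemd-networkd.
  networking.useNetworkd = true;

  # Open vSwitch bridge with eno1 as a member, as in the report.
  virtualisation.vswitch.enable = true;
  networking.vswitches.mybridge.interfaces.eno1 = { };

  # Static address on the bridge, as in the report.
  networking.interfaces.mybridge.ipv4.addresses = [
    {
      address = "192.168.0.102";
      prefixLength = 24;
    }
  ];
}
```

The default route and DNS settings from the original configuration are left out here for brevity and would still need to be added.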