amazonlinux / amazon-ec2-net-utils

ec2-net-utils contains a set of utilities for managing elastic network interfaces on Amazon EC2.
Apache License 2.0

Use systemd.device units instead of running `systemctl enable`/`disable` #112

Open commiterate opened 1 week ago

commiterate commented 1 week ago

Description

systemd supports device units (systemd.device) which are dynamically created/deleted/reloaded on device hot-attach/hot-detach/change.
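
To make the naming concrete: a device unit is named after the escaped form of the path it tracks. The sketch below is a hand-rolled, ASCII-only approximation of what `systemd-escape --path` does, written only to illustrate the mapping. The function name `escape_path` is hypothetical and not part of ec2-net-utils; on a real system, prefer `systemd-escape` itself.

```shell
# Approximate systemd path escaping (ASCII input assumed).
# The body runs in a subshell so its variables stay local to the function.
escape_path() (
    LC_ALL=C
    # Collapse repeated slashes, then trim one leading and one trailing slash.
    path=$(printf '%s' "$1" | tr -s '/')
    path=${path#/}
    path=${path%/}
    if [ -z "$path" ]; then
        # The root path escapes to a single dash.
        printf '%s\n' "-"
    else
        out=""
        first=1
        while [ -n "$path" ]; do
            rest=${path#?}
            c=${path%"$rest"}   # first character of the remaining path
            path=$rest
            case $c in
                /) out="$out-" ;;
                [A-Za-z0-9:_]) out="$out$c" ;;
                .) if [ "$first" -eq 1 ]; then out="$out\\x2e"; else out="$out."; fi ;;
                *) out="$out$(printf '\\x%02x' "'$c")" ;;
            esac
            first=0
        done
        printf '%s\n' "$out"
    fi
)
```

Under these assumptions, `escape_path /sys/devices/pci0000:00/0000:00:05.0/net/ens5` yields `sys-devices-pci0000:00-0000:00:05.0-net-ens5`, so the matching device unit would be that string with a `.device` suffix.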

Update the udev rules and systemd unit files to use this instead of systemctl enable/disable.

Background

The current udev rules create/delete + start/stop systemd units by running systemctl enable/disable --now on ENI hot-attach/detach.

This is undesirable for a few reasons:

  1. It doesn't work in distributions with an immutable /etc/systemd/system (e.g. NixOS).
    • systemctl enable/disable creates/deletes symlinks to systemd unit files elsewhere on the system (e.g. /lib/systemd/system) in /etc/systemd/system.
  2. Interface-specific units are started automatically on reboot by systemd without input from udev.
    • If an instance is stopped, an ENI is detached, and the instance is started again, there may be zombie service and timer units until udev emits a remove event. It may never emit one, for example if there is a power outage or the system crashes and the ENI is removed before the subsequent boot.

Even without switching to systemd.device units, the udev rules should be doing systemctl start/stop instead of systemctl enable/disable.
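
As a rough illustration of that interim fix, the helper below maps a udev action to a `systemctl start`/`stop` command line instead of `enable`/`disable`. It is a sketch only: the function name `eni_systemctl_command` is hypothetical, the unit names are taken from the discussion in this thread, and instantiating the units by interface name is an assumption about the current rules. It prints the command rather than running it.

```shell
# Print the systemctl command a udev rule could run for an ENI event.
# $1: udev action ("add" or "remove"); $2: interface name (e.g. ens5).
# The body runs in a subshell so its variables stay local to the function.
eni_systemctl_command() (
    action=$1
    iface=$2
    # Assumed unit names, instantiated per interface.
    units="policy-routes@${iface}.service refresh-policy-routes@${iface}.timer"
    case $action in
        add)    printf '%s\n' "systemctl start $units" ;;
        remove) printf '%s\n' "systemctl stop $units" ;;
        *)      exit 1 ;;   # other udev actions are not handled in this sketch
    esac
)
```

Because `start` and `stop` only affect the running state, nothing is written to /etc/systemd/system, so no state persists across reboots.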

See https://github.com/NixOS/nixpkgs/pull/355111#discussion_r1837045051 for a more detailed discussion.

Notes

arianvp commented 1 week ago

An idiomatic example would be:

SUBSYSTEM=="net", ACTION=="add", ENV{ID_NET_DRIVER}=="vif|ena|ixgbevf", ENV{SYSTEMD_WANTS}+="policy-routes@.service refresh-policy-routes@.timer"

And then add this to the timer and service:

[Unit]
Description=Set up policy routes for %I
# BindsTo= stops this unit if the device goes away.
# (Unit files do not support trailing comments, so this sits on its own line.)
BindsTo=%i.device
After=%i.device

[Service]
ExecStart=policy-routes %I

The script would then get the full sysfs path as an argument (e.g. /sys/devices/pci0000:00/0000:00:05.0/net/ens5) instead of just the device name, so the scripts need to be adjusted slightly.

systemd already adds TAG+=systemd to network devices, so this should just work.
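
The script adjustment mentioned above could be as small as taking the last path component of the argument. A minimal sketch, assuming the interface name is always the final component of the sysfs path; the function name `ifname_from_device_path` is hypothetical and not part of ec2-net-utils:

```shell
# Derive the interface name from a sysfs device path.
# Works whether or not the path has a leading or trailing slash.
# The body runs in a subshell so its variable stays local to the function.
ifname_from_device_path() (
    path=${1%/}                   # drop one trailing slash, if present
    printf '%s\n' "${path##*/}"   # keep everything after the last slash
)
```

A script could then begin with something like `iface=$(ifname_from_device_path "$1")` and continue to use the interface name as before.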

arianvp commented 1 week ago

> It doesn't work in distributions with an immutable /etc/systemd/system (e.g. NixOS). systemctl enable/disable creates/deletes symlinks to systemd unit files elsewhere on the system (e.g. /lib/systemd/system) in /etc/systemd/system.

Because the unit files have nothing in their [Install] section, I think this shouldn't be a problem. Basically, systemctl enable/disable --now just acts as systemctl start/stop because of it.

commiterate commented 1 week ago

That happens to be the case today, but it should probably be fixed so it doesn't become a bug in the future (if it isn't already one today with ENI detach while the instance is stopped).