lxc / incus

Powerful system container and virtual machine manager
https://linuxcontainers.org/incus
Apache License 2.0

`incus stop <instance>` does not seem to release `nictype=physical` devices #385

Closed mcondarelli closed 8 months ago

mcondarelli commented 8 months ago


Issue description

I have a USB WiFi adapter that I pass through to a container (standard images:openwrt/23.05). The first time everything is fine, but when I stop the container the adapter is not released and any attempt to reuse it fails. See the annotated log below. The attached incusd.log reports that it was unable to detach the adapter.

Steps to reproduce

## server is rebooted
mcon@cinderella:~$ ssh root@incus reboot

## WiFi card (wlx00c0cab2bd76) is visible
mcon@cinderella:~$ ssh root@incus -- ip l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 60:02:92:57:66:c1 brd ff:ff:ff:ff:ff:ff
3: enxa0cec8b43055: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether a0:ce:c8:b4:30:55 brd ff:ff:ff:ff:ff:ff
4: wlx00c0cab2bd76: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:c0:ca:b2:bd:76 brd ff:ff:ff:ff:ff:ff
5: ORANGE: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:16:3e:ed:dd:b3 brd ff:ff:ff:ff:ff:ff
6: incusbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:16:3e:54:c8:5c brd ff:ff:ff:ff:ff:ff
8: vethee6b197a@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master incusbr0 state UP mode DEFAULT group default qlen 1000
    link/ether de:d4:99:67:26:1f brd ff:ff:ff:ff:ff:ff link-netnsid 0
10: vethae9bd16d@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master incusbr0 state UP mode DEFAULT group default qlen 1000
    link/ether ae:89:33:8b:97:07 brd ff:ff:ff:ff:ff:ff link-netnsid 1
12: vethe3470c1f@if11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master ORANGE state UP mode DEFAULT group default qlen 1000
    link/ether ba:4b:6d:be:79:49 brd ff:ff:ff:ff:ff:ff link-netnsid 2

## this is instance configuration, notice `wlan0` device
mcon@cinderella:~$ incus config show incus:openwrt --expanded
architecture: x86_64
config:
  image.architecture: amd64
  image.description: Openwrt 23.05 amd64 (20240113_11:57)
  image.os: Openwrt
  image.release: "23.05"
  image.serial: "20240113_11:57"
  image.type: squashfs
  image.variant: default
  volatile.base_image: 632ab890e8c9ec28ccd47b971f64e0dda98c235687f5cfc2486a30a6b231139f
  volatile.cloud-init.instance-id: cd96c3f7-dee9-4d38-9006-927fa8cd0fca
  volatile.eth0.hwaddr: 00:16:3e:f2:5a:6f
  volatile.eth0.name: eth0
  volatile.eth1.hwaddr: 00:16:3e:57:99:42
  volatile.eth1.name: eth1
  volatile.eth2.hwaddr: 00:16:3e:ed:92:d6
  volatile.eth2.name: eth2
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"IsUID":true,"IsGID":false,"HostID":1000000,"NSID":0,"MapRange":1000000000},{"IsUID":false,"IsGID":true,"HostID":1000000,"NSID":0,"MapRange":1000000000}]'
  volatile.idmap.next: '[{"IsUID":true,"IsGID":false,"HostID":1000000,"NSID":0,"MapRange":1000000000},{"IsUID":false,"IsGID":true,"HostID":1000000,"NSID":0,"MapRange":1000000000}]'
  volatile.last_state.idmap: '[{"IsUID":true,"IsGID":false,"HostID":1000000,"NSID":0,"MapRange":1000000000},{"IsUID":false,"IsGID":true,"HostID":1000000,"NSID":0,"MapRange":1000000000}]'
  volatile.last_state.power: STOPPED
  volatile.uuid: 9bfb30dd-4ac8-4d48-83f3-acb208038603
  volatile.uuid.generation: 9bfb30dd-4ac8-4d48-83f3-acb208038603
  volatile.wlan0.host_name: wlx00c0cab2bd76
  volatile.wlan0.last_state.created: "false"
  volatile.wlan0.last_state.hwaddr: 00:c0:ca:b2:bd:76
  volatile.wlan0.last_state.mtu: "1500"
devices:
  eth0:
    nictype: macvlan
    parent: enp2s0
    type: nic
  eth1:
    nictype: macvlan
    parent: enxa0cec8b43055
    type: nic
  eth2:
    network: ORANGE
    type: nic
  root:
    path: /
    pool: default
    type: disk
  wlan0:
    name: wlan0
    nictype: physical
    parent: wlx00c0cab2bd76
    type: nic
ephemeral: false
profiles:
- default
stateful: false
description: OpenWrt firewall/router

## instance starts normally
mcon@cinderella:~$ incus start incus:openwrt

## after a while I stop it
mcon@cinderella:~$ incus stop incus:openwrt

## and after a few minutes I try to restart it
mcon@cinderella:~$ incus start incus:openwrt
Error: Failed to start device "wlan0": Parent device 'wlx00c0cab2bd76' doesn't exist
Try `incus info --show-log incus:openwrt` for more info

## It fails, and rightly so, because the device is no longer visible (presumably it was not detached on `incus stop`)
mcon@cinderella:~$ ssh root@incus -- ip l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 60:02:92:57:66:c1 brd ff:ff:ff:ff:ff:ff
3: enxa0cec8b43055: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether a0:ce:c8:b4:30:55 brd ff:ff:ff:ff:ff:ff
5: ORANGE: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:16:3e:ed:dd:b3 brd ff:ff:ff:ff:ff:ff
6: incusbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:16:3e:54:c8:5c brd ff:ff:ff:ff:ff:ff
8: vethee6b197a@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master incusbr0 state UP mode DEFAULT group default qlen 1000
    link/ether de:d4:99:67:26:1f brd ff:ff:ff:ff:ff:ff link-netnsid 0
10: vethae9bd16d@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master incusbr0 state UP mode DEFAULT group default qlen 1000
    link/ether ae:89:33:8b:97:07 brd ff:ff:ff:ff:ff:ff link-netnsid 1
12: vethe3470c1f@if11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master ORANGE state UP mode DEFAULT group default qlen 1000
    link/ether ba:4b:6d:be:79:49 brd ff:ff:ff:ff:ff:ff link-netnsid 2
mcon@cinderella:~$ 


stgraber commented 8 months ago

Wireless devices are a bit weird and are comprised of two devices, the phy and the interface. Most users only ever notice the interface which is what you interact with.

When moving wireless devices into a container, both the phy and interface are moved in. When the instance dies, the Linux kernel attempts to delete whatever can be deleted and move back the rest. In this case, it likely means that the interface gets deleted and the phy is moved back, which doesn't really help you.

I need to find a USB wifi adapter or something that I can use to poke at this. Hopefully it'd be possible to add pre-stop logic that relocates both the phy and the interface to the host system before the kernel deletes everything.
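
To make the phy/interface split concrete, here is a minimal shell sketch for the host side. It assumes `iw` is installed, that phys carry the kernel's default `phyN` names, and that the phy really does come back to the host without an interface after the container stops, which is only a guess at this point. `phys_without_iface` is a hypothetical helper for illustration, not part of Incus or iw.

```shell
#!/bin/sh
# phys_without_iface PHY_LIST IW_DEV_OUTPUT
# Prints every phy from PHY_LIST (one name per line, e.g. the entries of
# /sys/class/ieee80211) that has no "phy#N" section in `iw dev` output,
# i.e. a phy that currently has no network interface attached to it.
# Assumption: phys use the default "phyN" naming.
phys_without_iface() {
    printf '%s\n' "$1" | while IFS= read -r phy; do
        [ -n "$phy" ] || continue
        idx=${phy#phy}                      # "phy4" -> "4"
        if ! printf '%s\n' "$2" | grep -qx "phy#$idx"; then
            printf '%s\n' "$phy"
        fi
    done
}

# Possible use on the host after the container has stopped (run as root):
#   phys_without_iface "$(ls /sys/class/ieee80211)" "$(iw dev)"
# If the adapter's phy is listed, recreating an interface on it may avoid a
# reboot. The phy and interface names below are taken from this report and
# may differ on another system:
#   iw phy phy4 interface add wlx00c0cab2bd76 type managed
```

If the helper prints nothing, every phy still has an interface and the missing adapter has a different cause.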

mcondarelli commented 8 months ago

Let me know if I can help in any way.

stgraber commented 8 months ago

I've not been able to reproduce the issue here:

stgraber@dakara:~$ ip link | grep wl
3: wlp57s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DORMANT group default qlen 1000
31: wlxb44bd62a1d17: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DORMANT group default qlen 1000
stgraber@dakara:~$ incus config device add u1 wlan0 nic nictype=physical name=wlan0 parent=wlxb44bd62a1d17
Device wlan0 added to u1
stgraber@dakara:~$ incus start u1
stgraber@dakara:~$ ip link | grep wl
3: wlp57s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DORMANT group default qlen 1000
stgraber@dakara:~$ incus exec u1 -- ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
31: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DORMANT group default qlen 1000
    link/ether 9e:cb:30:7c:da:62 brd ff:ff:ff:ff:ff:ff permaddr b4:4b:d6:2a:1d:17
47: eth0@if48: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:16:3e:4d:51:51 brd ff:ff:ff:ff:ff:ff link-netnsid 0
stgraber@dakara:~$ incus stop u1
stgraber@dakara:~$ ip link | grep wl
3: wlp57s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DORMANT group default qlen 1000
31: wlxb44bd62a1d17: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DORMANT group default qlen 1000
stgraber@dakara:~$

stgraber commented 8 months ago

Can you confirm that the interface inside the container hasn't been renamed? I could see how having the interface renamed from Incus' expected name (wlan0) to something else could prevent it from being "rescued" ahead of shutdown.

stgraber commented 8 months ago

Also can you confirm that you have the iw tool installed on your system?

mcondarelli commented 8 months ago

I can confirm I have iw installed. The interface seems to have been renamed somehow; this is what I see in the container:

~ # ip l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: phy4-ap0: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue state DOWN qlen 1000
    link/ether 00:c0:ca:b2:bd:76 brd ff:ff:ff:ff:ff:ff
29: eth0@phy4-ap0: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 00:16:3e:f2:5a:6f brd ff:ff:ff:ff:ff:ff
30: eth1@if3: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 00:16:3e:57:99:42 brd ff:ff:ff:ff:ff:ff
31: eth2@if32: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 00:16:3e:ed:92:d6 brd ff:ff:ff:ff:ff:ff
~ # 

Output of iw list (if useful) is:

~ # iw list
Wiphy phy4
    wiphy index: 4
    max # scan SSIDs: 4
    max scan IEs length: 2243 bytes
    max # sched scan SSIDs: 0
    max # match sets: 0
    Retry short limit: 7
    Retry long limit: 4
    Coverage class: 0 (up to 0m)
    Device supports AP-side u-APSD.
    Device supports T-DLS.
    Available Antennas: TX 0x1 RX 0x1
    Configured Antennas: TX 0x1 RX 0x1
    Supported interface modes:
         * IBSS
         * managed
         * AP
         * AP/VLAN
         * monitor
         * mesh point
         * P2P-client
         * P2P-GO
    Band 1:
        Capabilities: 0x17e
            HT20/HT40
            SM Power Save disabled
            RX Greenfield
            RX HT20 SGI
            RX HT40 SGI
            RX STBC 1-stream
            Max AMSDU length: 3839 bytes
            No DSSS/CCK HT40
        Maximum RX AMPDU length 65535 bytes (exponent: 0x003)
        Minimum RX AMPDU time spacing: No restriction (0x00)
        HT TX/RX MCS rate indexes supported: 0-7
        Frequencies:
            * 2412 MHz [1] (1.0 dBm)
            * 2417 MHz [2] (1.0 dBm)
            * 2422 MHz [3] (1.0 dBm)
            * 2427 MHz [4] (2.0 dBm)
            * 2432 MHz [5] (2.0 dBm)
            * 2437 MHz [6] (2.0 dBm)
            * 2442 MHz [7] (2.0 dBm)
            * 2447 MHz [8] (2.0 dBm)
            * 2452 MHz [9] (2.0 dBm)
            * 2457 MHz [10] (2.0 dBm)
            * 2462 MHz [11] (2.0 dBm)
            * 2467 MHz [12] (2.0 dBm) (no IR)
            * 2472 MHz [13] (2.0 dBm) (no IR)
            * 2484 MHz [14] (2.0 dBm) (no IR)
    Band 2:
        Capabilities: 0x17e
            HT20/HT40
            SM Power Save disabled
            RX Greenfield
            RX HT20 SGI
            RX HT40 SGI
            RX STBC 1-stream
            Max AMSDU length: 3839 bytes
            No DSSS/CCK HT40
        Maximum RX AMPDU length 65535 bytes (exponent: 0x003)
        Minimum RX AMPDU time spacing: No restriction (0x00)
        HT TX/RX MCS rate indexes supported: 0-7
        VHT Capabilities (0x31800120):
            Max MPDU length: 3895
            Supported Channel Width: neither 160 nor 80+80
            short GI (80 MHz)
            RX antenna pattern consistency
            TX antenna pattern consistency
        VHT RX MCS set:
            1 streams: MCS 0-9
            2 streams: not supported
            3 streams: not supported
            4 streams: not supported
            5 streams: not supported
            6 streams: not supported
            7 streams: not supported
            8 streams: not supported
        VHT RX highest supported: 0 Mbps
        VHT TX MCS set:
            1 streams: MCS 0-9
            2 streams: not supported
            3 streams: not supported
            4 streams: not supported
            5 streams: not supported
            6 streams: not supported
            7 streams: not supported
            8 streams: not supported
        VHT TX highest supported: 0 Mbps
        VHT extended NSS: not supported
        Frequencies:
            * 5180 MHz [36] (15.0 dBm) (no IR)
            * 5200 MHz [40] (15.0 dBm) (no IR)
            * 5220 MHz [44] (15.0 dBm) (no IR)
            * 5240 MHz [48] (15.0 dBm) (no IR)
            * 5260 MHz [52] (14.0 dBm) (no IR, radar detection)
            * 5280 MHz [56] (14.0 dBm) (no IR, radar detection)
            * 5300 MHz [60] (15.0 dBm) (no IR, radar detection)
            * 5320 MHz [64] (15.0 dBm) (no IR, radar detection)
            * 5500 MHz [100] (13.0 dBm) (no IR, radar detection)
            * 5520 MHz [104] (12.0 dBm) (no IR, radar detection)
            * 5540 MHz [108] (11.0 dBm) (no IR, radar detection)
            * 5560 MHz [112] (11.0 dBm) (no IR, radar detection)
            * 5580 MHz [116] (11.0 dBm) (no IR, radar detection)
            * 5600 MHz [120] (11.0 dBm) (no IR, radar detection)
            * 5620 MHz [124] (11.0 dBm) (no IR, radar detection)
            * 5640 MHz [128] (11.0 dBm) (no IR, radar detection)
            * 5660 MHz [132] (11.0 dBm) (no IR, radar detection)
            * 5680 MHz [136] (11.0 dBm) (no IR, radar detection)
            * 5700 MHz [140] (11.0 dBm) (no IR, radar detection)
            * 5720 MHz [144] (12.0 dBm) (no IR, radar detection)
            * 5745 MHz [149] (12.0 dBm) (no IR)
            * 5765 MHz [153] (12.0 dBm) (no IR)
            * 5785 MHz [157] (12.0 dBm) (no IR)
            * 5805 MHz [161] (12.0 dBm) (no IR)
            * 5825 MHz [165] (12.0 dBm) (no IR)
            * 5845 MHz [169] (disabled)
            * 5865 MHz [173] (disabled)
    valid interface combinations:
         * #{ IBSS } <= 1, #{ managed, AP, mesh point, P2P-client, P2P-GO } <= 2,
           total <= 2, #channels <= 1, STA/AP BI must match
    HT Capability overrides:
         * MCS: ff ff ff ff ff ff ff ff ff ff
         * maximum A-MSDU length
         * supported channel width
         * short GI for 40 MHz
         * max A-MPDU length exponent
         * min MPDU start spacing
    max # scan plans: 1
    max scan plan interval: -1
    max scan plan iterations: 0
    Supported extended features:
        * [ VHT_IBSS ]: VHT-IBSS
        * [ RRM ]: RRM
        * [ FILS_STA ]: STA FILS (Fast Initial Link Setup)
        * [ CQM_RSSI_LIST ]: multiple CQM_RSSI_THOLD records
        * [ CONTROL_PORT_OVER_NL80211 ]: control port over nl80211
        * [ TXQS ]: FQ-CoDel-enabled intermediate TXQs
        * [ SCAN_RANDOM_SN ]: use random sequence numbers in scans
        * [ SCAN_MIN_PREQ_CONTENT ]: use probe request with only rate IEs in scans
        * [ AIRTIME_FAIRNESS ]: airtime fairness scheduling
        * [ AQL ]: Airtime Queue Limits (AQL)
        * [ CONTROL_PORT_NO_PREAUTH ]: disable pre-auth over nl80211 control port support
        * [ DEL_IBSS_STA ]: deletion of IBSS station support
        * [ SCAN_FREQ_KHZ ]: scan on kHz frequency support
        * [ CONTROL_PORT_OVER_NL80211_TX_STATUS ]: tx status for nl80211 control port support
~ #

stgraber commented 8 months ago

Okay, so that's likely the problem then.

If it always gets renamed to the same thing in the container, adjusting the name value in the container config to match that name should prevent the deletion on stop.

So based on the above, that would be phy4-ap0.
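
A minimal shell sketch of that check and the config change, assuming the instance and device names from this report (`incus:openwrt`, `wlan0`). `iface_present` is a hypothetical helper that only parses `ip link` style output; the `incus` commands are left as comments because they change the instance configuration.

```shell
#!/bin/sh
# iface_present NAME IP_LINK_OUTPUT
# Succeeds if NAME is one of the interface names in `ip link` style output.
iface_present() {
    printf '%s\n' "$2" | awk -v want="$1" '
        /^[0-9]+: / {
            name = $2
            sub(/:$/, "", name)     # drop the trailing colon
            sub(/@.*/, "", name)    # drop the "@peer" suffix, if any
            if (name == want) found = 1
        }
        END { exit (found ? 0 : 1) }'
}

# Possible use while the container is still running:
#   links=$(incus exec incus:openwrt -- ip link)
#   iface_present wlan0 "$links" || echo "wlan0 was renamed inside the container"
# If the interface reliably shows up as phy4-ap0, align the device config:
#   incus config device set incus:openwrt wlan0 name=phy4-ap0
```

With the container output posted above, `iface_present wlan0` fails and `iface_present phy4-ap0` succeeds, which is the mismatch in question.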

stgraber commented 8 months ago

If you can confirm that things work properly when the name lines up between your configuration and the name inside of the container prior to it being stopped, then we can close this issue.

mcondarelli commented 8 months ago

Thanks @stgraber, unfortunately I cannot confirm your analysis. There's something more going on on my side:

It seems to me the renaming is done by the Incus machinery and not by anything in my VM/container, but I might be very wrong.

In my specific case this is not a huge problem since this Incus server is meant to handle just 3 instances and to stay up 24/7; "inconvenience" is just in the present setup/trimming stage where I need to reboot the whole server between tries, but it seems to point to some non-trivial inconsistency inside Incus.

Just tell me if you want to further pursue the issue or if I have to live with it.

stgraber commented 8 months ago

@mcondarelli can you try passing the wireless card to a basic Ubuntu container instead? I'd like to see if the same issue happens when using a container and configuration similar to the one I tested here.

mcondarelli commented 8 months ago

@stgraber I have somewhat mixed results with plain Ubuntu:

Should I also try adding all the other NICs?

stgraber commented 8 months ago

Ah, interesting, so yeah, it does look like with Ubuntu (so without things getting renamed in the container) that things get restored properly.

The failure messages are interesting though, I'll have to look at that logic a bit.

mcondarelli commented 8 months ago

I will test on my "sensitive" setup as soon as it propagates to "daily".