canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

LXD 5.0.0-9a607f6 fails to restart ephemeral instances #10250

Status: Closed (paride closed 2 years ago)

paride commented 2 years ago

Required information

Issue description

LXD 5.0.0 can no longer restart ephemeral containers. This worked fine with LXD 4.x on the same host system.

Steps to reproduce

paride@diglett:~$ lxc launch ubuntu:bionic paride-b-ephemeral --ephemeral
Creating paride-b-ephemeral
Starting paride-b-ephemeral
paride@diglett:~$ lxc restart paride-b-ephemeral
Error: Failed to create instance update operation: Instance is busy running a "restart" operation
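
The steps above can be wrapped in a small check for scripted reproduction. This is a minimal sketch, not part of the original report: the function name `restart_reports_busy` is hypothetical, it assumes the `lxc` client is on `PATH`, and it captures both output streams because it assumes the error text may arrive on stderr. It only matches the error string quoted above.

```shell
#!/bin/sh
# Sketch: report whether restarting an instance hits the "busy" error
# quoted in this issue. The function name is hypothetical.

restart_reports_busy() {
    local name="$1" output
    # Capture stdout and stderr together so the error text can be inspected.
    if output="$(lxc restart "$name" 2>&1)"; then
        echo "restart succeeded"
        return 1
    fi
    case "$output" in
        *'Instance is busy running a "restart" operation'*)
            # Same message as in the report above.
            echo "reproduced: $output"
            return 0
            ;;
        *)
            # Some other failure; print it so it is not mistaken for this bug.
            echo "different failure: $output"
            return 2
            ;;
    esac
}
```

After launching the instance as shown above, `restart_reports_busy paride-b-ephemeral` returns 0 only when the reported error appears, which makes it usable as a pass/fail step when comparing LXD versions.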

Information to attach

LXD info (output of lxc info)

paride@diglett:~$ lxc info
config:
  core.https_address: '[::]'
  core.trust_password: true
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- devlxd_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- candid_authentication
- backup_compression
- candid_config
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- candid_config_key
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- rbac
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
- certificate_self_renewal
- instance_project_move
- storage_volume_project_move
- cloud_init
- network_dns_nat
- database_leader
- instance_all_projects
- clustering_groups
- ceph_rbd_du
- instance_get_full
- qemu_metrics
- gpu_mig_uuid
- event_project
- clustering_evacuation_live
- instance_allow_inconsistent_copy
- network_state_ovn
- storage_volume_api_filtering
- image_restrictions
- storage_zfs_export
- network_dns_records
- storage_zfs_reserve_space
- network_acl_log
- storage_zfs_blocksize
- metrics_cpu_seconds
- instance_snapshot_never
- certificate_token
- instance_nic_routed_neighbor_probe
- event_hub
- agent_nic_config
- projects_restricted_intercept
- metrics_authentication
- images_target_project
- cluster_migration_inconsistent_copy
- cluster_ovn_chassis
- container_syscall_intercept_sched_setscheduler
- storage_lvm_thinpool_metadata_size
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
environment:
  addresses:
  - 10.245.168.20:8443
  - 10.109.225.1:8443
  - '[fd42:1b7e:739f:50c0::1]:8443'
  - 192.168.122.1:8443
  - 172.17.0.1:8443
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    MIIFSjCCAzKgAwIBAgIRAIqNRgk6JiVdB08oha7hMlIwDQYJKoZIhvcNAQELBQAw
    NTEcMBoGA1UEChMTbGludXhjb250YWluZXJzLm9yZzEVMBMGA1UEAwwMcm9vdEBk
    aWdsZXR0MB4XDTE4MDYxNDE2MDY1NVoXDTI4MDYxMTE2MDY1NVowNTEcMBoGA1UE
    ChMTbGludXhjb250YWluZXJzLm9yZzEVMBMGA1UEAwwMcm9vdEBkaWdsZXR0MIIC
    IjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAqMMxb5zfQa0vXUoZ1DLRS3XK
    FJuuC8V6lAxKlcHw6riLDZDj4SJ/ucooEKtwHFvUXi/VyTWDNjle87lImwRML/Ud
    wOymRgb3kthgmoR22WhuhLdp+V2D5wEKicEcT/EVDcrIKtczz4NVkBsb7YXWn2vH
    YFaQTDR7DwOW25hZmll079GHAHLhldpO15YXI3GF5amGVhAHlGXRoa95CdEuWvZV
    nKt3Gb/CceMvactjCRffNK/Hn2XfO5m/HFk092yoTO+z6u5L0uxOnIYAxB5aQSKX
    4nSS62BOqduiiLysETsEYdgN5r4drsXZoU9DW0i8f4vOtMuQf4QHFE+Z/g/ldVpr
    9KyI3R6xMBnPbQ2EamUYsUleEleOV3272FzsTb9nJKl5+rHuRcVoAH3rmxJfWOZk
    fKm4ag/wkfYbT3Z3S3XDX2m1tguH2wCMNZMOwh8llrQlow3E3EE31HvzN7Ep9NaS
    ZOKet+o+jjT/PvvwZi97bAGAoL7/RGOoHREvIIWEeiczvGZmoPv3sY1f6Q5Sr6zT
    fH3x5xWSizmzDSJ2ydSbKEedbJqxh+KG8Lf0kEKRBADDnZTAfgy0VTtfYpugBSpL
    +ZMB6Dj42s0yLGFJBBomhlCFBy4B5fTQlnxROD9k+C2f0qKNCs+MBxGKF6v7zFq1
    5ZK/7q00CuS2sgb5XwsCAwEAAaNVMFMwDgYDVR0PAQH/BAQDAgWgMBMGA1UdJQQM
    MAoGCCsGAQUFBwMBMAwGA1UdEwEB/wQCMAAwHgYDVR0RBBcwFYIHZGlnbGV0dIcE
    CvWoFIcEwKh6ATANBgkqhkiG9w0BAQsFAAOCAgEAl2+HaoYsayRyqAx9ZBtLc9C4
    VZJofRRxrO2IzxYm2SRLugmwS3abU3J5xjk3WJUJjDcnmTVrwEurWnoxD/VljHt0
    YQTCWj6Fuk9X3TlsHjRmqo109limH0VOn//xZ3dtI3IFnOYzYjGMLX0QxkTJk95o
    CtzHhT5FXxO2FDibRLIUVlJv9ZXMdFM3mxGy/I16ktHMjI+HVad4uXwL/7ZcA9u3
    X8on8iun01YTtKozKPM6DSjTh0QR6kAqtroPeGPcxiCAVwQV5yB5wIuglf0tXZdU
    YTScUlbrYmvGKvhVj363sFLnnZO7/SN65564Rw1T8Mto5J9z7u5/3fa31UcDmvYf
    6v/QeHXegCoDANGWNL2ZuYlU5/xUSDa30LERJZFg12LS43e1VPikrwOomfWyf0At
    /saRlSop1/9E4Ez+LZh4pI5D0VjClJs5901SfSlumNEfOJHCnE6Eeg5MKU5YORLA
    Grjif62zmROqkcb4xNFz/jrTnoSECo4Ypbq1PSBW1n6bD26Ml6gaf2TGKrYMPaCd
    r8YZ/n/UOIZRuTqsHcuB4NbeWL11390gX0elDNNxEY1G+anLEFgfQh+TWVGMr6Qk
    ASvFnsLiqKakm94ust7i8P0qs1n8xAxrOGLNChtS7kjC2+y7a4plCyCVz59KQMa5
    ZOZsAgwmZ69aOySDxSE=
    -----END CERTIFICATE-----
  certificate_fingerprint: 8a9fd84e11b5d47d094948343c45b1875adccc1d48073c6d067019e963c224cf
  driver: lxc | qemu
  driver_version: 4.0.12 | 6.1.1
  firewall: nftables
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    idmapped_mounts: "true"
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    seccomp_listener_continue: "true"
    shiftfs: "false"
    uevent_injection: "true"
    unpriv_fscaps: "true"
  kernel_version: 5.15.0-25-generic
  lxc_features:
    cgroup2: "true"
    core_scheduling: "true"
    devpts_fd: "true"
    idmapped_mounts_v2: "true"
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    network_veth_router: "true"
    pidfd: "true"
    seccomp_allow_deny_syntax: "true"
    seccomp_notify: "true"
    seccomp_proxy_send_notify_fd: "true"
  os_name: Ubuntu
  os_version: "22.04"
  project: default
  server: lxd
  server_clustered: false
  server_event_mode: full-mesh
  server_name: diglett
  server_pid: 46452
  server_version: 5.0.0
  storage: zfs
  storage_version: 2.1.2-1ubuntu3
  storage_supported_drivers:
  - name: ceph
    version: 15.2.14
    remote: true
  - name: btrfs
    version: 5.4.1
    remote: false
  - name: cephfs
    version: 15.2.14
    remote: true
  - name: dir
    version: "1"
    remote: false
  - name: lvm
    version: 2.03.07(2) (2019-11-30) / 1.02.167 (2019-11-30) / 4.45.0
    remote: false
  - name: zfs
    version: 2.1.2-1ubuntu3
    remote: false

Container log

paride@diglett:~$ lxc info paride-b-ephemeral --show-log
Name: paride-b-ephemeral
Status: RUNNING
Type: container (ephemeral)
Architecture: x86_64
PID: 192718
Created: 2022/04/13 13:24 UTC
Last Used: 2022/04/13 13:24 UTC

Resources:
  Processes: 24
  Disk usage:
    root: 16.54MiB
  CPU usage:
    CPU usage (in seconds): 8
  Memory usage:
    Memory (current): 117.92MiB
  Network usage:
    eth0:
      Type: broadcast
      State: UP
      Host interface: vethbd5acf89
      MAC address: 00:16:3e:a8:d8:2d
      MTU: 1500
      Bytes received: 17.24kB
      Bytes sent: 7.51kB
      Packets received: 75
      Packets sent: 69
      IP addresses:
        inet:  10.109.225.67/24 (global)
        inet6: fd42:1b7e:739f:50c0:216:3eff:fea8:d82d/64 (global)
        inet6: fe80::216:3eff:fea8:d82d/64 (link)
    lo:
      Type: loopback
      State: UP
      MTU: 65536
      Bytes received: 1.07kB
      Bytes sent: 1.07kB
      Packets received: 12
      Packets sent: 12
      IP addresses:
        inet:  127.0.0.1/8 (local)
        inet6: ::1/128 (local)

Log:

lxc paride-b-ephemeral 20220413132420.363 WARN     conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc paride-b-ephemeral 20220413132420.363 WARN     conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc paride-b-ephemeral 20220413132420.363 WARN     conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc paride-b-ephemeral 20220413132420.363 WARN     conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc paride-b-ephemeral 20220413132513.331 WARN     conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc paride-b-ephemeral 20220413132513.331 WARN     conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc paride-b-ephemeral 20220413132514.852 WARN     conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc paride-b-ephemeral 20220413132514.852 WARN     conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc paride-b-ephemeral 20220413132517.319 WARN     conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc paride-b-ephemeral 20220413132517.319 WARN     conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc paride-b-ephemeral 20220413132533.117 WARN     conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc paride-b-ephemeral 20220413132533.117 WARN     conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc paride-b-ephemeral 20220413132543.311 WARN     conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc paride-b-ephemeral 20220413132543.311 WARN     conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing
lxc paride-b-ephemeral 20220413132550.296 WARN     conf - conf.c:lxc_map_ids:3592 - newuidmap binary is missing
lxc paride-b-ephemeral 20220413132550.297 WARN     conf - conf.c:lxc_map_ids:3598 - newgidmap binary is missing

Container config

paride@diglett:~$ lxc config show paride-b-ephemeral --expanded
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 18.04 LTS amd64 (release) (20220325)
  image.label: release
  image.os: ubuntu
  image.release: bionic
  image.serial: "20220325"
  image.type: squashfs
  image.version: "18.04"
  volatile.base_image: 5925c17819b3098041c9a699d04d21fe5344017321470ff8ba219eb67c597443
  volatile.cloud-init.instance-id: f5f268b9-c13e-4c88-9d90-d9b85e5f7190
  volatile.eth0.host_name: vethbd5acf89
  volatile.eth0.hwaddr: 00:16:3e:a8:d8:2d
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
  volatile.uuid: 783709f0-cb70-47bd-904d-dc509af176be
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk
ephemeral: true
profiles:
- default
stateful: false
description: ""

paride commented 2 years ago

Also: ephemeral VMs appear to be nonfunctional, perhaps because they need to reboot before becoming available:

paride@diglett:~$ lxc launch ubuntu:focal paride-f --vm --ephemeral
Creating paride-f
Starting paride-f

[sleep several minutes]

paride@diglett:~$ lxc exec paride-f uptime
Error: LXD VM agent isn't currently running
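
The manual "sleep several minutes" step can be replaced with a bounded polling loop. This is a sketch under assumptions, not a fix: the function name `wait_for_agent` and its default timeout and interval are hypothetical, and it treats a successful `lxc exec NAME -- true` as the sign that the VM agent is up. With the behaviour described above, the loop would be expected to time out.

```shell
#!/bin/sh
# Sketch: poll until the VM agent answers, or give up after a timeout.
# The function name and default values are hypothetical.

wait_for_agent() {
    local name="$1" timeout="${2:-300}" interval="${3:-5}" waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # `lxc exec` fails while the agent is not running, so a trivial
        # command is enough to probe it.
        if lxc exec "$name" -- true >/dev/null 2>&1; then
            echo "agent ready after ${waited}s"
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "agent not ready after ${timeout}s" >&2
    return 1
}
```

For example, `wait_for_agent paride-f 600 10 && lxc exec paride-f uptime` runs `uptime` only once the agent responds, and otherwise fails after ten minutes instead of waiting indefinitely.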