Vultaire commented 1 year ago

Issue description

Snap refreshes of LXD seem to intermittently trigger loss of /proc endpoints. The mounts still exist from the perspective of the containers, but attempting to access them results in messages such as ls: cannot access '/proc/stat': Transport endpoint is not connected.
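
A minimal sketch of how this state could be detected from inside a container. It assumes the affected files are the LXCFS-backed entries under /proc and that the failure surfaces as errno ENOTCONN, which is the errno behind "Transport endpoint is not connected". The path list and the function name are illustrative and not part of LXD or LXCFS.

```python
import errno
import os

# Files that LXCFS typically overlays inside a container. This list is an
# assumption for illustration; adjust it to the mounts you actually see.
LXCFS_PROC_PATHS = [
    "/proc/cpuinfo",
    "/proc/meminfo",
    "/proc/stat",
    "/proc/uptime",
]


def find_disconnected(paths=LXCFS_PROC_PATHS, stat=os.stat):
    """Return the paths whose stat() fails with ENOTCONN."""
    broken = []
    for path in paths:
        try:
            stat(path)
        except OSError as exc:
            # Other errors (for example ENOENT) are ignored on purpose:
            # only a dead FUSE endpoint is of interest here.
            if exc.errno == errno.ENOTCONN:
                broken.append(path)
    return broken


if __name__ == "__main__":
    for path in find_disconnected():
        print(f"{path}: Transport endpoint is not connected")
```

Run from a monitoring check, a non-empty result would flag the container before other checks start failing with less specific errors.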

This seems like it may be somewhat of a corner case, although on a cloud I'm supporting I've seen this happen on 2 different hosts within roughly the last month.

The environment in question is running LXD on snap channel 5.0/stable on Ubuntu Focal.

Also, one message seems to jump out of the journalctl logs at me: Error: Error creating database: schema version '43' is more recent than expected '42'. This seems to trigger the LXD service going into a restart loop.
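
As a rough illustration, the two schema numbers can be pulled out of that message so a log scanner can tell which side is ahead. This is only a sketch: the regular expression is written against the exact wording quoted above, and the function name is hypothetical.

```python
import re

# Pattern written against the message quoted in this report.
SCHEMA_ERROR = re.compile(
    r"schema version '(?P<found>\d+)' is more recent than expected '(?P<expected>\d+)'"
)


def parse_schema_mismatch(line):
    """Return (found, expected) as ints, or None if the line does not match."""
    match = SCHEMA_ERROR.search(line)
    if match is None:
        return None
    return int(match.group("found")), int(match.group("expected"))


line = "Error: Error creating database: schema version '43' is more recent than expected '42'"
found, expected = parse_schema_mismatch(line)
if found > expected:
    # The database on disk is ahead of what the running binary understands.
    print(f"database is at schema {found}, running binary expects {expected}")
```

Note that a log line truncated by the pager before the second number would not match and would return None.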

Steps to reproduce

Let snap refresh the LXD snap automatically. (Bug is intermittent and may not readily occur.)

Information to attach

For the most recent occurrence, which I observed today, I have an NRPE log entry indicating a problem while running an NRPE check:

 Service Warning[2023-01-17 21:36:42] SERVICE ALERT: <redacted>-heat-2;<redacted>-heat-2-corosync_proc;WARNING;SOFT;1;System call sent warnings to stderr: Error: /proc must be mounted

This is not a smoking gun, but as circumstantial evidence I have this from journalctl -u snap.lxd.daemon.service:

Jan 17 17:00:45 <redacted> lxd.daemon[3531867]: => LXD is ready
Jan 17 21:35:49 <redacted> systemd[1]: Stopping Service for snap application lxd.daemon...
Jan 17 21:35:49 <redacted> lxd.daemon[732373]: => Stop reason is: snap refresh
Jan 17 21:35:49 <redacted> lxd.daemon[732373]: => Stopping LXD
Jan 17 21:35:49 <redacted> lxd.daemon[3531867]: => LXD exited cleanly
Jan 17 21:35:50 <redacted> lxd.daemon[732373]: ==> Stopped LXD
Jan 17 21:35:50 <redacted> systemd[1]: snap.lxd.daemon.service: Succeeded.
Jan 17 21:35:50 <redacted> systemd[1]: Stopped Service for snap application lxd.daemon.
Jan 17 21:35:54 <redacted> systemd[1]: Started Service for snap application lxd.daemon.
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: => Preparing the system (23541)
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Loading snap configuration
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Setting up mntns symlink (mnt:[4026534411])
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Setting up kmod wrapper
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Preparing /boot
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Preparing a clean copy of /run
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Preparing /run/bin
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Preparing a clean copy of /etc
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Preparing a clean copy of /usr/share/misc
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Setting up ceph configuration
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Setting up LVM configuration
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Setting up OVN configuration
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Rotating logs
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Setting up ZFS (0.8)
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Escaping the systemd cgroups
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ====> Detected cgroup V1
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Escaping the systemd process resource limits
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: ==> Disabling shiftfs on this kernel (auto)
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: => Re-using existing LXCFS
Jan 17 21:35:54 <redacted> lxd.daemon[733180]: => Starting LXD
Jan 17 21:35:54 <redacted> lxd.daemon[733383]: time="2023-01-17T21:35:54Z" level=warning msg=" - Couldn't find the CGroup blkio.weight, disk priority will be ignored"
Jan 17 21:35:54 <redacted> lxd.daemon[733383]: time="2023-01-17T21:35:54Z" level=warning msg=" - Couldn't find the CGroup memory swap accounting, swap limits will be ignored"
Jan 17 21:35:54 <redacted> lxd.daemon[733383]: time="2023-01-17T21:35:54Z" level=error msg="Failed to start the daemon" err="Error creating database: schema version '43' is more recent than expec>
Jan 17 21:35:54 <redacted> lxd.daemon[733383]: Error: Error creating database: schema version '43' is more recent than expected '42'
Jan 17 21:35:55 <redacted> lxd.daemon[733180]: => LXD failed to start
Jan 17 21:35:55 <redacted> systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Jan 17 21:35:55 <redacted> systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.

The previous snap refresh occurred at 17:00:37, with its final log message included above. The next refresh, at 21:35, had an error resulting in a restart loop. The timing seems suspiciously close to when the /proc endpoints stopped responding from within the containers.
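
To put a number on "suspiciously close", the timestamps quoted above can be compared directly. This sketch assumes the NRPE log and the journal use the same timezone, which the report does not state. The helper names are made up for illustration.

```python
from datetime import datetime

YEAR = 2023  # journalctl's short format omits the year; taken from the NRPE entry


def parse_journal(stamp):
    """Parse a short journalctl timestamp such as 'Jan 17 21:35:49'."""
    return datetime.strptime(f"{YEAR} {stamp}", "%Y %b %d %H:%M:%S")


def parse_nrpe(stamp):
    """Parse the bracketed NRPE timestamp such as '2023-01-17 21:36:42'."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")


refresh_stop = parse_journal("Jan 17 21:35:49")   # "Stop reason is: snap refresh"
daemon_failed = parse_journal("Jan 17 21:35:55")  # "LXD failed to start"
nrpe_warning = parse_nrpe("2023-01-17 21:36:42")  # "/proc must be mounted"

gap_after_stop = int((nrpe_warning - refresh_stop).total_seconds())
gap_after_failure = int((nrpe_warning - daemon_failed).total_seconds())
print(gap_after_stop, gap_after_failure)  # 53 47
```

Under that timezone assumption, the NRPE warning fired 53 seconds after the refresh stopped LXD and 47 seconds after the first failed start.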

Required information

Distribution: Ubuntu
Distribution version: 20.04 Focal

The output of "lxc info":

config:
  core.proxy_ignore_hosts: <REDACTED>,127.0.0.1,::1,localhost
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- devlxd_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- candid_authentication
- backup_compression
- candid_config
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- candid_config_key
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- rbac
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
- certificate_self_renewal
- instance_project_move
- storage_volume_project_move
- cloud_init
- network_dns_nat
- database_leader
- instance_all_projects
- clustering_groups
- ceph_rbd_du
- instance_get_full
- qemu_metrics
- gpu_mig_uuid
- event_project
- clustering_evacuation_live
- instance_allow_inconsistent_copy
- network_state_ovn
- storage_volume_api_filtering
- image_restrictions
- storage_zfs_export
- network_dns_records
- storage_zfs_reserve_space
- network_acl_log
- storage_zfs_blocksize
- metrics_cpu_seconds
- instance_snapshot_never
- certificate_token
- instance_nic_routed_neighbor_probe
- event_hub
- agent_nic_config
- projects_restricted_intercept
- metrics_authentication
- images_target_project
- cluster_migration_inconsistent_copy
- cluster_ovn_chassis
- container_syscall_intercept_sched_setscheduler
- storage_lvm_thinpool_metadata_size
- storage_volume_state_total
- instance_file_head
- instances_nic_host_name
- image_copy_profile
- container_syscall_intercept_sysinfo
- clustering_evacuation_mode
- resources_pci_vpd
- qemu_raw_conf
- storage_cephfs_fscache
- network_load_balancer
- vsock_api
- instance_ready_state
- network_bgp_holdtime
- storage_volumes_all_projects
- metrics_memory_oom_total
- storage_buckets
- storage_buckets_create_credentials
- metrics_cpu_effective_total
- projects_networks_restricted_access
- storage_buckets_local
- loki
- acme
- internal_metrics
- cluster_join_token_expiry
- remote_token_expiry
- init_preseed
- storage_volumes_created_at
- cpu_hotplug
- projects_networks_zones
- network_txqueuelen
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
environment:
  addresses: []
  architectures:
  - x86_64
  - i686
  certificate: <REDACTED>
  certificate_fingerprint: <REDACTED>
  driver: lxc | qemu
  driver_version: 5.0.1 | 7.1.0
  firewall: nftables
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    idmapped_mounts: "false"
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    seccomp_listener_continue: "true"
    shiftfs: "false"
    uevent_injection: "true"
    unpriv_fscaps: "true"
  kernel_version: 5.4.0-124-generic
  lxc_features:
    cgroup2: "true"
    core_scheduling: "true"
    devpts_fd: "true"
    idmapped_mounts_v2: "true"
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    network_veth_router: "true"
    pidfd: "true"
    seccomp_allow_deny_syntax: "true"
    seccomp_notify: "true"
    seccomp_proxy_send_notify_fd: "true"
  os_name: Ubuntu
  os_version: "20.04"
  project: default
  server: lxd
  server_clustered: false
  server_event_mode: full-mesh
  server_name: <REDACTED>
  server_pid: 1990946
  server_version: "5.10"
  storage: dir
  storage_version: "1"
  storage_supported_drivers:
  - name: zfs
    version: 0.8.3-1ubuntu12.14
    remote: false
  - name: btrfs
    version: 5.4.1
    remote: false
  - name: ceph
    version: 15.2.17
    remote: true
  - name: cephfs
    version: 15.2.17
    remote: true
  - name: cephobject
    version: 15.2.17
    remote: true
  - name: dir
    version: "1"
    remote: false
  - name: lvm
    version: 2.03.07(2) (2019-11-30) / 1.02.167 (2019-11-30) / 4.41.0
    remote: false

stgraber commented 1 year ago

That's a lxcfs crash, moving over to lxcfs.

The schema version error is caused by snapd potentially having done a rollback for you. LXD can't actually be rolled back because of DB schema changes; you can only ever go forward in releases. Running snap refresh lxd should get you onto the latest revision, which takes care of that part.

As for the LXCFS crash, we've been adding some debugging logic in the snap to better catch some of those, as well as rolling out a variety of bugfixes with LXD 5.10. The difficulty with fixing lxcfs issues is that a fix needs a full restart of LXD (and all containers) to take effect, so it can get a bit tricky to know exactly which version you are running.

mihalicyn commented 1 year ago

Hi @Vultaire!

Have you got any new reproductions of this issue?

As Stéphane said, we have added some extra debug logic to the LXD snap package and fixed a few issues in LXCFS.

mihalicyn commented 8 months ago

Let's close this for now. Feel free to reopen if the issue is still relevant and reproducible.

lxc / lxcfs

Intermittent loss of /proc endpoints on snap refresh #582
