lxc / incus

Powerful system container and virtual machine manager
https://linuxcontainers.org/incus
Apache License 2.0
2.83k stars 225 forks source link

incusd crashes when a message is written to console websocket #1449

Open tsuzu opened 1 day ago

tsuzu commented 1 day ago

Required information

incus info ``` config: cluster.https_address: :8443 core.https_address: :8443 oidc.audience: oidc.client.id: oidc.issuer: api_extensions: - storage_zfs_remove_snapshots - container_host_shutdown_timeout - container_stop_priority - container_syscall_filtering - auth_pki - container_last_used_at - etag - patch - usb_devices - https_allowed_credentials - image_compression_algorithm - directory_manipulation - container_cpu_time - storage_zfs_use_refquota - storage_lvm_mount_options - network - profile_usedby - container_push - container_exec_recording - certificate_update - container_exec_signal_handling - gpu_devices - container_image_properties - migration_progress - id_map - network_firewall_filtering - network_routes - storage - file_delete - file_append - network_dhcp_expiry - storage_lvm_vg_rename - storage_lvm_thinpool_rename - network_vlan - image_create_aliases - container_stateless_copy - container_only_migration - storage_zfs_clone_copy - unix_device_rename - storage_lvm_use_thinpool - storage_rsync_bwlimit - network_vxlan_interface - storage_btrfs_mount_options - entity_description - image_force_refresh - storage_lvm_lv_resizing - id_map_base - file_symlinks - container_push_target - network_vlan_physical - storage_images_delete - container_edit_metadata - container_snapshot_stateful_migration - storage_driver_ceph - storage_ceph_user_name - resource_limits - storage_volatile_initial_source - storage_ceph_force_osd_reuse - storage_block_filesystem_btrfs - resources - kernel_limits - storage_api_volume_rename - network_sriov - console - restrict_dev_incus - migration_pre_copy - infiniband - dev_incus_events - proxy - network_dhcp_gateway - file_get_symlink - network_leases - unix_device_hotplug - storage_api_local_volume_handling - operation_description - clustering - event_lifecycle - storage_api_remote_volume_handling - nvidia_runtime - container_mount_propagation - container_backup - dev_incus_images - container_local_cross_pool_handling - proxy_unix - proxy_udp - clustering_join - proxy_tcp_udp_multi_port_handling - network_state - proxy_unix_dac_properties - container_protection_delete - unix_priv_drop - pprof_http - proxy_haproxy_protocol - network_hwaddr - proxy_nat - network_nat_order - container_full - backup_compression - nvidia_runtime_config - storage_api_volume_snapshots - storage_unmapped - projects - network_vxlan_ttl - container_incremental_copy - usb_optional_vendorid - snapshot_scheduling - snapshot_schedule_aliases - container_copy_project - clustering_server_address - clustering_image_replication - container_protection_shift - snapshot_expiry - container_backup_override_pool - snapshot_expiry_creation - network_leases_location - resources_cpu_socket - resources_gpu - resources_numa - kernel_features - id_map_current - event_location - storage_api_remote_volume_snapshots - network_nat_address - container_nic_routes - cluster_internal_copy - seccomp_notify - lxc_features - container_nic_ipvlan - network_vlan_sriov - storage_cephfs - container_nic_ipfilter - resources_v2 - container_exec_user_group_cwd - container_syscall_intercept - container_disk_shift - storage_shifted - resources_infiniband - daemon_storage - instances - image_types - resources_disk_sata - clustering_roles - images_expiry - resources_network_firmware - backup_compression_algorithm - ceph_data_pool_name - container_syscall_intercept_mount - compression_squashfs - container_raw_mount - container_nic_routed - container_syscall_intercept_mount_fuse - container_disk_ceph - virtual-machines - image_profiles - clustering_architecture - resources_disk_id - storage_lvm_stripes - vm_boot_priority - unix_hotplug_devices - api_filtering - instance_nic_network - clustering_sizing - firewall_driver - projects_limits - container_syscall_intercept_hugetlbfs - limits_hugepages - container_nic_routed_gateway - projects_restrictions - custom_volume_snapshot_expiry - volume_snapshot_scheduling - trust_ca_certificates - snapshot_disk_usage - clustering_edit_roles - container_nic_routed_host_address - container_nic_ipvlan_gateway - resources_usb_pci - resources_cpu_threads_numa - resources_cpu_core_die - api_os - container_nic_routed_host_table - container_nic_ipvlan_host_table - container_nic_ipvlan_mode - resources_system - images_push_relay - network_dns_search - container_nic_routed_limits - instance_nic_bridged_vlan - network_state_bond_bridge - usedby_consistency - custom_block_volumes - clustering_failure_domains - resources_gpu_mdev - console_vga_type - projects_limits_disk - network_type_macvlan - network_type_sriov - container_syscall_intercept_bpf_devices - network_type_ovn - projects_networks - projects_networks_restricted_uplinks - custom_volume_backup - backup_override_name - storage_rsync_compression - network_type_physical - network_ovn_external_subnets - network_ovn_nat - network_ovn_external_routes_remove - tpm_device_type - storage_zfs_clone_copy_rebase - gpu_mdev - resources_pci_iommu - resources_network_usb - resources_disk_address - network_physical_ovn_ingress_mode - network_ovn_dhcp - network_physical_routes_anycast - projects_limits_instances - network_state_vlan - instance_nic_bridged_port_isolation - instance_bulk_state_change - network_gvrp - instance_pool_move - gpu_sriov - pci_device_type - storage_volume_state - network_acl - migration_stateful - disk_state_quota - storage_ceph_features - projects_compression - projects_images_remote_cache_expiry - certificate_project - network_ovn_acl - projects_images_auto_update - projects_restricted_cluster_target - images_default_architecture - network_ovn_acl_defaults - gpu_mig - project_usage - network_bridge_acl - warnings - projects_restricted_backups_and_snapshots - clustering_join_token - clustering_description - server_trusted_proxy - clustering_update_cert - storage_api_project - server_instance_driver_operational - server_supported_storage_drivers - event_lifecycle_requestor_address - resources_gpu_usb - clustering_evacuation - network_ovn_nat_address - network_bgp - network_forward - custom_volume_refresh - network_counters_errors_dropped - metrics - image_source_project - clustering_config - network_peer - linux_sysctl - network_dns - ovn_nic_acceleration - certificate_self_renewal - instance_project_move - storage_volume_project_move - cloud_init - network_dns_nat - database_leader - instance_all_projects - clustering_groups - ceph_rbd_du - instance_get_full - qemu_metrics - gpu_mig_uuid - event_project - clustering_evacuation_live - instance_allow_inconsistent_copy - network_state_ovn - storage_volume_api_filtering - image_restrictions - storage_zfs_export - network_dns_records - storage_zfs_reserve_space - network_acl_log - storage_zfs_blocksize - metrics_cpu_seconds - instance_snapshot_never - certificate_token - instance_nic_routed_neighbor_probe - event_hub - agent_nic_config - projects_restricted_intercept - metrics_authentication - images_target_project - images_all_projects - cluster_migration_inconsistent_copy - cluster_ovn_chassis - container_syscall_intercept_sched_setscheduler - storage_lvm_thinpool_metadata_size - storage_volume_state_total - instance_file_head - instances_nic_host_name - image_copy_profile - container_syscall_intercept_sysinfo - clustering_evacuation_mode - resources_pci_vpd - qemu_raw_conf - storage_cephfs_fscache - network_load_balancer - vsock_api - instance_ready_state - network_bgp_holdtime - storage_volumes_all_projects - metrics_memory_oom_total - storage_buckets - storage_buckets_create_credentials - metrics_cpu_effective_total - projects_networks_restricted_access - storage_buckets_local - loki - acme - internal_metrics - cluster_join_token_expiry - remote_token_expiry - init_preseed - storage_volumes_created_at - cpu_hotplug - projects_networks_zones - network_txqueuelen - cluster_member_state - instances_placement_scriptlet - storage_pool_source_wipe - zfs_block_mode - instance_generation_id - disk_io_cache - amd_sev - storage_pool_loop_resize - migration_vm_live - ovn_nic_nesting - oidc - network_ovn_l3only - ovn_nic_acceleration_vdpa - cluster_healing - instances_state_total - auth_user - security_csm - instances_rebuild - numa_cpu_placement - custom_volume_iso - network_allocations - zfs_delegate - storage_api_remote_volume_snapshot_copy - operations_get_query_all_projects - metadata_configuration - syslog_socket - event_lifecycle_name_and_project - instances_nic_limits_priority - disk_initial_volume_configuration - operation_wait - image_restriction_privileged - cluster_internal_custom_volume_copy - disk_io_bus - storage_cephfs_create_missing - instance_move_config - ovn_ssl_config - certificate_description - disk_io_bus_virtio_blk - loki_config_instance - instance_create_start - clustering_evacuation_stop_options - boot_host_shutdown_action - agent_config_drive - network_state_ovn_lr - image_template_permissions - storage_bucket_backup - storage_lvm_cluster - shared_custom_block_volumes - auth_tls_jwt - oidc_claim - device_usb_serial - numa_cpu_balanced - image_restriction_nesting - network_integrations - instance_memory_swap_bytes - network_bridge_external_create - network_zones_all_projects - storage_zfs_vdev - container_migration_stateful - profiles_all_projects - instances_scriptlet_get_instances - instances_scriptlet_get_cluster_members - instances_scriptlet_get_project - network_acl_stateless - instance_state_started_at - networks_all_projects - network_acls_all_projects - storage_buckets_all_projects - resources_load - instance_access - project_access - projects_force_delete - resources_cpu_flags - disk_io_bus_cache_filesystem - instance_oci - clustering_groups_config - instances_lxcfs_per_instance - clustering_groups_vm_cpu_definition - disk_volume_subpath - projects_limits_disk_pool - network_ovn_isolated - qemu_raw_qmp - network_load_balancer_health_check - oidc_scopes - network_integrations_peer_name - qemu_scriptlet - instance_auto_restart - storage_lvm_metadatasize - ovn_nic_promiscuous - ovn_nic_ip_address_none - instances_state_os_info - network_load_balancer_state - instance_nic_macvlan_mode - storage_lvm_cluster_create - network_ovn_external_interfaces - instances_scriptlet_get_instances_count - cluster_rebalance - custom_volume_refresh_exclude_older_snapshots api_status: stable api_version: "1.0" auth: trusted public: false auth_methods: - tls - oidc auth_user_name: root auth_user_method: unix environment: addresses: - architectures: - x86_64 - i686 certificate: | -----BEGIN CERTIFICATE----- MIIB/zCCAYagAwIBAgIQWRCbU44hrisQmKspUyDV8jAKBggqhkjOPQQDAzAyMRkw FwYDVQQKExBMaW51eCBDb250YWluZXJzMRUwEwYDVQQDDAxyb290QHBhcHJpa2Ew HhcNMjQxMTI1MTg0ODUzWhcNMzQxMTIzMTg0ODUzWjAyMRkwFwYDVQQKExBMaW51 eCBDb250YWluZXJzMRUwEwYDVQQDDAxyb290QHBhcHJpa2EwdjAQBgcqhkjOPQIB BgUrgQQAIgNiAASZI1i282jvdDzNX9aYzAw+cgPMVjH17A6a0VB/QQWaxn31W67h ee5CEkrGyFnyi6aiJJLuvpkJ39XWQgr5qEorIb9MgTsAmvJKTnBuZIMv4JQeChEi IGS/BbBJCCdlchWjYTBfMA4GA1UdDwEB/wQEAwIFoDATBgNVHSUEDDAKBggrBgEF BQcDATAMBgNVHRMBAf8EAjAAMCoGA1UdEQQjMCGCB3BhcHJpa2GHBH8AAAGHEAAA AAAAAAAAAAAAAAAAAAEwCgYIKoZIzj0EAwMDZwAwZAIwI/dyfbAzC8BrHmApElAt jLaZQoOQtrkfNQc7Njon+H6unHsjp8MO1hRDz/omhBE6AjA8XASjj6LD0hzMA4Pa ou1D30Yt9GrlVbt4bBbSdmKjqkjNzuCOzJC5fEQmrVNY7qU= -----END CERTIFICATE----- certificate_fingerprint: d87686e669cb458392f13f57e74da9f5c40d95b945514d61445daf4c18629d40 driver: lxc | qemu driver_version: 6.0.2 | 9.0.3 firewall: nftables kernel: Linux kernel_architecture: x86_64 kernel_features: idmapped_mounts: "true" netnsid_getifaddrs: "true" seccomp_listener: "true" seccomp_listener_continue: "true" uevent_injection: "true" unpriv_binfmt: "false" unpriv_fscaps: "true" kernel_version: 6.6.21-flatcar lxc_features: cgroup2: "true" core_scheduling: "true" devpts_fd: "true" idmapped_mounts_v2: "true" mount_injection_file: "true" network_gateway_device_route: "true" network_ipvlan: "true" network_l2proxy: "true" network_phys_macvlan_mtu: "true" network_veth_router: "true" pidfd: "true" seccomp_allow_deny_syntax: "true" seccomp_notify: "true" seccomp_proxy_send_notify_fd: "true" os_name: Debian GNU/Linux os_version: "12" project: default server: incus server_clustered: true server_event_mode: full-mesh server_name: paprika server_pid: 215824 server_version: "6.7" storage: lvm storage_version: 2.03.16(2) (2022-05-18) / 1.02.185 (2022-05-18) / 4.48.0 storage_supported_drivers: - name: dir version: "1" remote: false - name: lvm version: 2.03.16(2) (2022-05-18) / 1.02.185 (2022-05-18) / 4.48.0 remote: false - name: lvmcluster version: 2.03.16(2) (2022-05-18) / 1.02.185 (2022-05-18) / 4.48.0 remote: true - name: btrfs version: "6.2" remote: false ```

Issue description

When I opened Text console of a VM/Container on Web UI and then moved to another pane like Configuration, incusd crashed with the following output:

panic: concurrent write to websocket connection

goroutine 12380 [running]:
github.com/gorilla/websocket.(*messageWriter).flushFrame(0xc00087de68, 0x1, {0x373a6c0?, 0xc00087de01?, 0xc00087de98?})
        /root/go/pkg/mod/github.com/gorilla/websocket@v1.5.3/conn.go:617 +0x4b8
github.com/gorilla/websocket.(*Conn).WriteMessage(0xc0004f1340, 0x1fa654c?, {0x373a6c0, 0x0, 0x0})
        /root/go/pkg/mod/github.com/gorilla/websocket@v1.5.3/conn.go:770 +0x126
github.com/lxc/incus/v6/shared/ws.(*wrapper).Close(0xc0010620c0)
        /build/incus/shared/ws/rwc.go:92 +0x17f
github.com/lxc/incus/v6/shared/ws.MirrorRead.func1()
        /build/incus/shared/ws/mirror.go:38 +0x21d
created by github.com/lxc/incus/v6/shared/ws.MirrorRead in goroutine 12377
        /build/incus/shared/ws/mirror.go:32 +0x1e5

or

panic: concurrent write to websocket connection

goroutine 2786 [running]:
github.com/gorilla/websocket.(*messageWriter).flushFrame(0xc000871d48, 0x1, {0xc001957116?, 0x0?, 0xc000119508?})
        /root/go/pkg/mod/github.com/gorilla/websocket@v1.5.3/conn.go:617 +0x4b8
github.com/gorilla/websocket.(*Conn).WriteMessage(0xc0019109a0, 0x0?, {0xc001957116, 0x2, 0x2})
        /root/go/pkg/mod/github.com/gorilla/websocket@v1.5.3/conn.go:770 +0x126
main.(*consoleWs).doConsole.func4()
        /build/incus/cmd/incusd/instance_console.go:324 +0x49
main.(*consoleWs).doConsole(0xc00104e1e0, 0x0?)
        /build/incus/cmd/incusd/instance_console.go:345 +0x4c2
main.(*consoleWs).Do(0xc00104e1e0, 0xc001178780)
        /build/incus/cmd/incusd/instance_console.go:202 +0x89
github.com/lxc/incus/v6/internal/server/operations.(*Operation).Start.func1(0xc001178780)
        /build/incus/internal/server/operations/operations.go:290 +0x26
created by github.com/lxc/incus/v6/internal/server/operations.(*Operation).Start in goroutine 2602
        /build/incus/internal/server/operations/operations.go:289 +0x105

Steps to reproduce

I can reproduce the problem by

  1. Open the incus web UI
  2. Open Text console of a container
  3. Move to another pane like Configuration
  4. Go back to Step 2

After some tries, incusd will crash. However, I'm still looking for a way to reproduce this with CLI or a Go program.

Information to attach

INFO   [2024-12-02T16:09:27Z] Action: instance-console, Source: /1.0/instances/testvm, Requestor: oidc/github|8574909 (172.71.24.104)  type=console
DEBUG  [2024-12-02T16:09:27Z] [paprika] Started mirroring websocket         address="172.71.24.99:59566"
DEBUG  [2024-12-02T16:09:27Z] [paprika] Websocket: Started read mirror      address="172.71.24.99:59566"
DEBUG  [2024-12-02T16:09:27Z] [paprika] Websocket: Started write mirror     address="172.71.24.99:59566"
INFO   [2024-12-02T16:09:27Z] Action: instance-console, Source: /1.0/instances/testvm, Requestor: oidc/github|8574909 (172.71.24.104)  type=console
DEBUG  [2024-12-02T16:09:27Z] [paprika] Handling API request                ip="162.158.119.126:25678" method=GET protocol=oidc url="/1.0/operations/6490c63e-db50-4f2c-909d-86b6285a363c/websocket?secret=044caddb2b7fae88fb9e9a96e96f94c7adbe8844d972bafd11910b1774d2588f" username="github|8574909"
DEBUG  [2024-12-02T16:09:27Z] [paprika] Connecting to operation             class=websocket description="Showing console" operation=6490c63e-db50-4f2c-909d-86b6285a363c project=default
DEBUG  [2024-12-02T16:09:27Z] [paprika] Connected to operation              class=websocket description="Showing console" operation=6490c63e-db50-4f2c-909d-86b6285a363c project=default
DEBUG  [2024-12-02T16:09:29Z] [paprika] Websocket: Stopped read mirror      address="172.71.24.99:59566" err="read unix:@->/run/incus/testvm/qemu.console: file already closed"
DEBUG  [2024-12-02T16:09:29Z] [paprika] Got error getting next reader: websocket: close 1005 (no status)
DEBUG  [2024-12-02T16:09:29Z] [paprika] Finished mirroring console to websocket  address="172.71.24.99:59566"
DEBUG  [2024-12-02T16:09:29Z] [paprika] Websocket: Sending barrier message  address="172.71.24.99:59566"
DEBUG  [2024-12-02T16:09:29Z] [paprika] Console control websocket finished
Error: websocket: close 1006 (abnormal closure): unexpected EOF
tsuzu commented 1 day ago

I found a potentical cause. In doConsole, ws.Mirror for conns[0] is running on another goroutine. https://github.com/lxc/incus/blob/c2f495057d87624f1fa84d081718323fb011cc71/cmd/incusd/instance_console.go#L290-L327

When consoleDoneCh is closed before conns[0] is closed, conns[0].WriteMessage can be potentially called from two goroutines: doConsole goroutine https://github.com/lxc/incus/blob/c2f495057d87624f1fa84d081718323fb011cc71/cmd/incusd/instance_console.go#L324

and ws.Mirror goroutine https://github.com/lxc/incus/blob/c2f495057d87624f1fa84d081718323fb011cc71/shared/ws/rwc.go#L92

so can't we remove the first WriteMessage, which is called without mutex?