canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Introduce starting/stopping/restarting instance states #10625

Open jpnurmi opened 2 years ago

jpnurmi commented 2 years ago

Required information

lxc info ``` config: core.https_address: '[::]:8443' core.trust_password: true api_extensions: - storage_zfs_remove_snapshots - container_host_shutdown_timeout - container_stop_priority - container_syscall_filtering - auth_pki - container_last_used_at - etag - patch - usb_devices - https_allowed_credentials - image_compression_algorithm - directory_manipulation - container_cpu_time - storage_zfs_use_refquota - storage_lvm_mount_options - network - profile_usedby - container_push - container_exec_recording - certificate_update - container_exec_signal_handling - gpu_devices - container_image_properties - migration_progress - id_map - network_firewall_filtering - network_routes - storage - file_delete - file_append - network_dhcp_expiry - storage_lvm_vg_rename - storage_lvm_thinpool_rename - network_vlan - image_create_aliases - container_stateless_copy - container_only_migration - storage_zfs_clone_copy - unix_device_rename - storage_lvm_use_thinpool - storage_rsync_bwlimit - network_vxlan_interface - storage_btrfs_mount_options - entity_description - image_force_refresh - storage_lvm_lv_resizing - id_map_base - file_symlinks - container_push_target - network_vlan_physical - storage_images_delete - container_edit_metadata - container_snapshot_stateful_migration - storage_driver_ceph - storage_ceph_user_name - resource_limits - storage_volatile_initial_source - storage_ceph_force_osd_reuse - storage_block_filesystem_btrfs - resources - kernel_limits - storage_api_volume_rename - macaroon_authentication - network_sriov - console - restrict_devlxd - migration_pre_copy - infiniband - maas_network - devlxd_events - proxy - network_dhcp_gateway - file_get_symlink - network_leases - unix_device_hotplug - storage_api_local_volume_handling - operation_description - clustering - event_lifecycle - storage_api_remote_volume_handling - nvidia_runtime - container_mount_propagation - container_backup - devlxd_images - container_local_cross_pool_handling - proxy_unix - proxy_udp - clustering_join - proxy_tcp_udp_multi_port_handling - network_state - proxy_unix_dac_properties - container_protection_delete - unix_priv_drop - pprof_http - proxy_haproxy_protocol - network_hwaddr - proxy_nat - network_nat_order - container_full - candid_authentication - backup_compression - candid_config - nvidia_runtime_config - storage_api_volume_snapshots - storage_unmapped - projects - candid_config_key - network_vxlan_ttl - container_incremental_copy - usb_optional_vendorid - snapshot_scheduling - snapshot_schedule_aliases - container_copy_project - clustering_server_address - clustering_image_replication - container_protection_shift - snapshot_expiry - container_backup_override_pool - snapshot_expiry_creation - network_leases_location - resources_cpu_socket - resources_gpu - resources_numa - kernel_features - id_map_current - event_location - storage_api_remote_volume_snapshots - network_nat_address - container_nic_routes - rbac - cluster_internal_copy - seccomp_notify - lxc_features - container_nic_ipvlan - network_vlan_sriov - storage_cephfs - container_nic_ipfilter - resources_v2 - container_exec_user_group_cwd - container_syscall_intercept - container_disk_shift - storage_shifted - resources_infiniband - daemon_storage - instances - image_types - resources_disk_sata - clustering_roles - images_expiry - resources_network_firmware - backup_compression_algorithm - ceph_data_pool_name - container_syscall_intercept_mount - compression_squashfs - container_raw_mount - container_nic_routed - 
container_syscall_intercept_mount_fuse - container_disk_ceph - virtual-machines - image_profiles - clustering_architecture - resources_disk_id - storage_lvm_stripes - vm_boot_priority - unix_hotplug_devices - api_filtering - instance_nic_network - clustering_sizing - firewall_driver - projects_limits - container_syscall_intercept_hugetlbfs - limits_hugepages - container_nic_routed_gateway - projects_restrictions - custom_volume_snapshot_expiry - volume_snapshot_scheduling - trust_ca_certificates - snapshot_disk_usage - clustering_edit_roles - container_nic_routed_host_address - container_nic_ipvlan_gateway - resources_usb_pci - resources_cpu_threads_numa - resources_cpu_core_die - api_os - container_nic_routed_host_table - container_nic_ipvlan_host_table - container_nic_ipvlan_mode - resources_system - images_push_relay - network_dns_search - container_nic_routed_limits - instance_nic_bridged_vlan - network_state_bond_bridge - usedby_consistency - custom_block_volumes - clustering_failure_domains - resources_gpu_mdev - console_vga_type - projects_limits_disk - network_type_macvlan - network_type_sriov - container_syscall_intercept_bpf_devices - network_type_ovn - projects_networks - projects_networks_restricted_uplinks - custom_volume_backup - backup_override_name - storage_rsync_compression - network_type_physical - network_ovn_external_subnets - network_ovn_nat - network_ovn_external_routes_remove - tpm_device_type - storage_zfs_clone_copy_rebase - gpu_mdev - resources_pci_iommu - resources_network_usb - resources_disk_address - network_physical_ovn_ingress_mode - network_ovn_dhcp - network_physical_routes_anycast - projects_limits_instances - network_state_vlan - instance_nic_bridged_port_isolation - instance_bulk_state_change - network_gvrp - instance_pool_move - gpu_sriov - pci_device_type - storage_volume_state - network_acl - migration_stateful - disk_state_quota - storage_ceph_features - projects_compression - projects_images_remote_cache_expiry - certificate_project - network_ovn_acl - projects_images_auto_update - projects_restricted_cluster_target - images_default_architecture - network_ovn_acl_defaults - gpu_mig - project_usage - network_bridge_acl - warnings - projects_restricted_backups_and_snapshots - clustering_join_token - clustering_description - server_trusted_proxy - clustering_update_cert - storage_api_project - server_instance_driver_operational - server_supported_storage_drivers - event_lifecycle_requestor_address - resources_gpu_usb - clustering_evacuation - network_ovn_nat_address - network_bgp - network_forward - custom_volume_refresh - network_counters_errors_dropped - metrics - image_source_project - clustering_config - network_peer - linux_sysctl - network_dns - ovn_nic_acceleration - certificate_self_renewal - instance_project_move - storage_volume_project_move - cloud_init - network_dns_nat - database_leader - instance_all_projects - clustering_groups - ceph_rbd_du - instance_get_full - qemu_metrics - gpu_mig_uuid - event_project - clustering_evacuation_live - instance_allow_inconsistent_copy - network_state_ovn - storage_volume_api_filtering - image_restrictions - storage_zfs_export - network_dns_records - storage_zfs_reserve_space - network_acl_log - storage_zfs_blocksize - metrics_cpu_seconds - instance_snapshot_never - certificate_token - instance_nic_routed_neighbor_probe - event_hub - agent_nic_config - projects_restricted_intercept - metrics_authentication - images_target_project - cluster_migration_inconsistent_copy - cluster_ovn_chassis - 
container_syscall_intercept_sched_setscheduler - storage_lvm_thinpool_metadata_size - storage_volume_state_total - instance_file_head - instances_nic_host_name - image_copy_profile - container_syscall_intercept_sysinfo - clustering_evacuation_mode - resources_pci_vpd - qemu_raw_conf - storage_cephfs_fscache api_status: stable api_version: "1.0" auth: trusted public: false auth_methods: - tls environment: addresses: - 192.168.86.47:8443 - 10.64.43.1:8443 - 172.17.0.1:8443 - 10.66.41.1:8443 - '[fd42:96c2:5811:78fd::1]:8443' architectures: - x86_64 - i686 certificate: | -----BEGIN CERTIFICATE----- MIICEzCCAZmgAwIBAgIRALCumPkS068O0UA8SuJgOdEwCgYIKoZIzj0EAwMwOTEc MBoGA1UEChMTbGludXhjb250YWluZXJzLm9yZzEZMBcGA1UEAwwQcm9vdEB4cHMt MTUtOTUyMDAeFw0yMjA2MTQyMDMxMTZaFw0zMjA2MTEyMDMxMTZaMDkxHDAaBgNV BAoTE2xpbnV4Y29udGFpbmVycy5vcmcxGTAXBgNVBAMMEHJvb3RAeHBzLTE1LTk1 MjAwdjAQBgcqhkjOPQIBBgUrgQQAIgNiAAQCZH/Ulw9tIlkqGdLvkElD6e9LTGjx qxTED9TcKdjfshj5snAeuwcHTcBiYIVUStgfoc8r79QB6toCo92/JgvliCPMO4ao 5QhZM5qdZEQucENiHWbd7Z7GEzJ8CkfQNgijZTBjMA4GA1UdDwEB/wQEAwIFoDAT BgNVHSUEDDAKBggrBgEFBQcDATAMBgNVHRMBAf8EAjAAMC4GA1UdEQQnMCWCC3hw cy0xNS05NTIwhwR/AAABhxAAAAAAAAAAAAAAAAAAAAABMAoGCCqGSM49BAMDA2gA MGUCMQCcOkTaX2nElJyI+NKt4PJwvldAdj0C2tv8H3lG4StP3Eh4USJpH+DroYjG MdggveoCMCJc8R0ZL4LHZTtoMyTPm4laj/4Ab/L9u4gDvcQgVlLtQnS7iNX2/YzJ F+FfQ1fQxA== -----END CERTIFICATE----- certificate_fingerprint: 8ea6348710db867f45ab614d8ce4dc8690b5a16746bdc92f1cb29b10d192e70f driver: lxc | qemu driver_version: 5.0.0 | 7.0.0 firewall: nftables kernel: Linux kernel_architecture: x86_64 kernel_features: idmapped_mounts: "true" netnsid_getifaddrs: "true" seccomp_listener: "true" seccomp_listener_continue: "true" shiftfs: "false" uevent_injection: "true" unpriv_fscaps: "true" kernel_version: 5.15.0-39-generic lxc_features: cgroup2: "true" core_scheduling: "true" devpts_fd: "true" idmapped_mounts_v2: "true" mount_injection_file: "true" network_gateway_device_route: "true" network_ipvlan: "true" network_l2proxy: "true" network_phys_macvlan_mtu: "true" network_veth_router: "true" pidfd: "true" seccomp_allow_deny_syntax: "true" seccomp_notify: "true" seccomp_proxy_send_notify_fd: "true" os_name: Ubuntu os_version: "22.10" project: default server: lxd server_clustered: false server_event_mode: full-mesh server_name: xps-15-9520 server_pid: 2044 server_version: "5.3" storage: dir storage_version: "1" storage_supported_drivers: - name: zfs version: 2.1.2-1ubuntu3 remote: false - name: ceph version: 15.2.16 remote: true - name: btrfs version: 5.4.1 remote: false - name: cephfs version: 15.2.16 remote: true - name: dir version: "1" remote: false - name: lvm version: 2.03.07(2) (2019-11-30) / 1.02.167 (2019-11-30) / 4.45.0 remote: false ```

Issue description

I'm tracking the status of instances with a websocket connected to the /1.0/events stream. The problem is that intermediate instance statuses are not being reported. For example, when starting and stopping instances, querying the instance status in response to any related operation or lifecycle event results in either "Running" or "Stopped", never "Starting" or "Stopping".

When starting an instance, LXD immediately sends the following operation events:

{"type":"operation","timestamp":"2022-06-30T14:36:35.269507Z","metadata":{"id":"a96a3a0a-f4fe-4858-a9bb-6d131b77b7dc","class":"task","description":"Starting instance","created_at":"2022-06-30T16:36:35.259107994+02:00","updated_at":"2022-06-30T16:36:35.259107994+02:00","status":"Pending","status_code":105,"resources":{"instances":["/1.0/instances/hardy-snake"]},"metadata":null,"may_cancel":false,"err":"","location":"none"},"location":"none","project":"default"}
{"type":"operation","timestamp":"2022-06-30T14:36:35.269560Z","metadata":{"id":"a96a3a0a-f4fe-4858-a9bb-6d131b77b7dc","class":"task","description":"Starting instance","created_at":"2022-06-30T16:36:35.259107994+02:00","updated_at":"2022-06-30T16:36:35.259107994+02:00","status":"Running","status_code":103,"resources":{"instances":["/1.0/instances/hardy-snake"]},"metadata":null,"may_cancel":false,"err":"","location":"none"},"location":"none","project":"default"}

At this point, the instance status is still "Stopped". After a few seconds, LXD sends two more events:

{"type":"lifecycle","timestamp":"2022-06-30T14:36:38.973268Z","metadata":{"action":"instance-started","source":"/1.0/instances/hardy-snake","requestor":{"username":"jpnurmi","protocol":"unix","address":"@"}},"location":"none","project":"default"}
{"type":"operation","timestamp":"2022-06-30T14:36:38.973389Z","metadata":{"id":"a96a3a0a-f4fe-4858-a9bb-6d131b77b7dc","class":"task","description":"Starting instance","created_at":"2022-06-30T16:36:35.259107994+02:00","updated_at":"2022-06-30T16:36:35.259107994+02:00","status":"Success","status_code":200,"resources":{"instances":["/1.0/instances/hardy-snake"]},"metadata":null,"may_cancel":false,"err":"","location":"none"},"location":"none","project":"default"}

At this point, the instance status is already "Running". I'm missing the "Starting" status in between, which I'd need in order to visualize it in a GUI.

Steps to reproduce

  1. Connect to /1.0/events?type=operation,lifecycle
  2. Start or stop an instance
  3. GET /1.0/instances/{name} whenever a lifecycle or operation event is received for that instance

Expected result: either separate "Starting" and "Stopping" lifecycle events, or an up-to-date instance status when the operation events are sent.

Actual result: the status goes straight from "Stopped" to "Running" and back.
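For reference, the reproduction steps can be sketched with the LXD Go client roughly as follows. This is a minimal sketch, not part of the report: the import paths and call signatures are from memory and may differ between LXD releases, and the instance name is the example used above.

```go
package main

import (
	"fmt"
	"log"

	lxd "github.com/lxc/lxd/client"
	"github.com/lxc/lxd/shared/api"
)

func main() {
	// Connect over the local unix socket.
	c, err := lxd.ConnectLXDUnix("", nil)
	if err != nil {
		log.Fatal(err)
	}

	// Subscribe to the /1.0/events websocket.
	listener, err := c.GetEvents()
	if err != nil {
		log.Fatal(err)
	}

	// On every operation/lifecycle event, re-query the instance state.
	// "hardy-snake" is the example instance from the report above.
	_, err = listener.AddHandler([]string{"operation", "lifecycle"}, func(e api.Event) {
		state, _, err := c.GetInstanceState("hardy-snake")
		if err != nil {
			return
		}
		fmt.Printf("%s event at %s -> instance status: %s\n", e.Type, e.Timestamp, state.Status)
	})
	if err != nil {
		log.Fatal(err)
	}

	// Block until the connection drops; the printed status only ever
	// alternates between "Stopped" and "Running".
	log.Fatal(listener.Wait())
}
```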

stgraber commented 2 years ago

Pending => Running => Success refers to the background operation.

LXD internally doesn't have the concept of a starting/stopping type status, so you're never going to see it reflected in the instance status itself.

If you want to map that to an instance status on your side, you could probably derive it from those operation events.
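As an editorial illustration, such a client-side mapping might look like the sketch below, based only on the operation payloads shown earlier in this issue. The description strings for stop/restart and the idea of treating a Pending/Running operation as a transient state are assumptions, not a documented contract.

```go
package main

import "github.com/lxc/lxd/shared/api"

// deriveStatus maps an operation event to a transient UI status for the
// instances it references, or returns "" when no transient state applies.
// The matched descriptions are assumptions based on the events shown above.
func deriveStatus(op api.Operation) string {
	// Only an operation that is still pending or running implies a
	// transient state; on Success/Failure, fall back to the real status
	// from GET /1.0/instances/{name}.
	if op.StatusCode != api.Pending && op.StatusCode != api.Running {
		return ""
	}
	switch op.Description {
	case "Starting instance":
		return "Starting"
	case "Stopping instance": // assumed wording
		return "Stopping"
	case "Restarting instance": // assumed wording
		return "Restarting"
	}
	return ""
}
```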

jpnurmi commented 2 years ago

Could you consider this a feature request to add "Starting" and "Stopping" instance statuses and lifecycle events?

stgraber commented 2 years ago

I don't think the lifecycle events really make sense, as the goal of the lifecycle API is effectively to be an audit log of everything that changed. We want this stream to be somewhat concise, so I'm not particularly keen on tripling the number of events (we'd need 3x because if we introduce a "starting" event, we'd also need a "failed" one, whereas we currently only report successes).

For states, this should be doable, though we wouldn't want those intermediate states to ever make it into the database, as we're already dealing with far more DB writes than we'd want. We'd effectively base those states on whether we have an internal ongoing instance operation of a given type; that should be something we can retrieve easily enough from within RenderState.
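A rough illustration of that idea follows. This is not actual LXD code: the ongoing-action lookup and the names are hypothetical, and it only shows the shape of overriding the rendered status while an in-memory start/stop/restart operation is in flight, without any extra database write.

```go
package main

// renderStatus is a hypothetical sketch: derive the reported status from the
// stored one plus any in-memory instance operation currently in flight, so
// nothing extra is written to the database.
func renderStatus(storedStatus string, ongoingAction string) string {
	switch ongoingAction {
	case "start":
		return "Starting"
	case "stop":
		return "Stopping"
	case "restart":
		return "Restarting"
	default:
		return storedStatus // no ongoing operation: report the usual status
	}
}
```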

jpnurmi commented 2 years ago

What would be the best way to detect state changes that could be originating from other clients? The UI should update if an instance is started or stopped using lxc, for example.

Given that the description of operation events is freeform, the UI does not try to interpret them but blindly sends a state request whenever it sees any operation event that mentions an instance in the resources field. This is why I proposed those lifecycle events. :)
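For illustration, that kind of blind filtering might look like the sketch below, parsing only the resources field out of the raw operation event metadata (in the format shown in the event payloads earlier in this issue). Each returned name would then trigger the usual GET /1.0/instances/{name} refresh.

```go
package main

import (
	"encoding/json"
	"path"
)

// instancesInOperation extracts instance names from an operation event's
// metadata by looking only at its "resources" field, without interpreting
// the freeform description. URLs look like "/1.0/instances/hardy-snake".
func instancesInOperation(metadata json.RawMessage) []string {
	var op struct {
		Resources map[string][]string `json:"resources"`
	}
	if err := json.Unmarshal(metadata, &op); err != nil {
		return nil
	}
	var names []string
	for _, u := range op.Resources["instances"] {
		names = append(names, path.Base(u)) // keep the trailing name segment
	}
	return names
}
```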

stgraber commented 2 years ago

Part of what's making this hard for us to figure out is that there's really no such thing as stopping or restarting.

When you ask LXD to stop or restart an instance, we send an event to the instance's init system. Whether the instance does anything with it, we have no idea. So introducing a stopping/restarting state is a bit awkward, because we don't know that we're actually in that state: if the init system doesn't take action, the instance is still running, but we can't know that.

Similarly, if we line this up with our internal operation, we will be reporting stopping/restarting for a period of 30s, then the internal operation will expire and you may just be back to a running state. But being back in a running state doesn't mean the instance didn't get the signal and won't go straight to stopped a couple of minutes later.

stgraber commented 2 years ago

The same is true should we introduce additional lifecycle events.

If a shutdown/restart is initiated from within the guest, LXD won't know about it, so you'll go straight to one of:

If the shutdown/restart is initiated over the API, then we could do one of:

But there's no guarantee that we'll go from that to one of:

If the instance doesn't stop or restart within 30s, the operation will expire and we could in theory return one of:

But again, this doesn't mean the instance didn't get the signal and if it did, then a few minutes later, you may get one of:

Coming effectively out of nowhere.

jpnurmi commented 2 years ago

Thanks for the explanation. I understand that it's impossible for LXD to know what an instance does with a start/stop request. From the UI's perspective, it would be fine if the state later gets fixed up and jumps straight back to a known state. It's just important for the user to be able to see that LXD is in the process of doing/trying/requesting something. In that sense, a separate requested_state field, a volatile flag, or anything along those lines would work just as well.

jpnurmi commented 2 years ago

Exposing operation types (InstanceStart, InstanceStop, ...) in the API could be a fine alternative to the lifecycle events.

turtle0x1 commented 2 years ago

I also rebuild UI based on the lifecycle callbacks, so I somewhat understand your need for this, but at the same time, what about the multi-user scenario? Are you going to display a "spinning cog" and then display an error? The poor user(s) who didn't try to start that instance: why are they seeing an error?

To me, I'd say it's down to the user that tried to start the instance and is waiting on the operation; everyone with access to the project shouldn't be notified, IMO.

Failing to start because of user input seems like something that shouldn't be an event.

Failing to start because of a host issue seems like a warning (or what else are they for?)

jpnurmi commented 2 years ago

How to present the state and whether to distinguish the client's "own" operations is a matter of UX.

Exposing the state could be useful even for the LXC command line tool. For example, instead of hanging, lxc stop could guide the user or perhaps even offer to force-stop an instance that the server is already attempting to stop.

turtle0x1 commented 2 years ago

client's "own" operations is a matter of UX.

Kinda; still, relying on LXD to tell you the originator of the request (or your own struct, which ... meh) only really works if you're issuing a cert per user that's "logged in" to compare against (which I would argue is probably rarer in the case of web interfaces, as they don't need a cert since they're effectively a proxy).

Can't really comment on the LXC CLI; I assume it would require it to start listening on the events socket after the fact (which may take longer than the operation and become racy).