canonical / lxd

lxd 5.18 candidate not retrieving buildd instances #12295

Closed lengau closed 1 year ago

lengau commented 1 year ago

The lxd candidate bump to 5.18 doesn't seem to handle downloads from linuxcontainers.org or from buildd as expected, which appears to break snapcraft and charmcraft if they need to download lxd images.

Required information

Issue description

lxd 5.18 (snap revision 25709) doesn't seem to be able to get images from buildd or from linuxcontainers.org.

Steps to reproduce

  1. Add the remote: lxc remote add craft-com.ubuntu.cloud-buildd https://cloud-images.ubuntu.com/buildd/releases
  2. Try to launch an image: lxc launch craft-com.ubuntu.cloud-buildd:core20 core20

Output:

Creating core20
Error: Failed instance creation: Failed getting remote image info: Failed getting image: The requested image couldn't be found
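The two steps above can be wrapped in a small script so the failure is easy to re-run against different snap revisions. This is only a sketch. The remote name and URL come from the report, but the function names and the `DRY_RUN` switch are hypothetical. Passing `--protocol simplestreams` explicitly is an assumption based on the protocol shown in the debug output.

```shell
#!/bin/sh
# Reproduction sketch. Remote name and URL are taken from the report;
# run_cmd, repro_buildd_launch and DRY_RUN are hypothetical helpers.
REMOTE_NAME="craft-com.ubuntu.cloud-buildd"
REMOTE_URL="https://cloud-images.ubuntu.com/buildd/releases"

# run_cmd: print the command when DRY_RUN=1, otherwise execute it.
run_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# repro_buildd_launch ALIAS [INSTANCE]: add the remote, then try to launch.
# INSTANCE defaults to ALIAS, matching the command used in the report.
repro_buildd_launch() {
    image_alias="$1"
    instance="${2:-$1}"
    # Assumption: the remote is registered with the simplestreams protocol.
    run_cmd lxc remote add "$REMOTE_NAME" "$REMOTE_URL" --protocol simplestreams ||
        echo "remote add failed (the remote may already exist)" >&2
    run_cmd lxc launch "$REMOTE_NAME:$image_alias" "$instance"
}
```

Running `DRY_RUN=1 repro_buildd_launch core20` prints the two `lxc` commands without touching the local LXD; dropping `DRY_RUN` executes them.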

Information to attach

debug output ``` $ lxc launch craft-com.ubuntu.cloud-buildd:core20 core20 --debug DEBUG [2023-09-21T20:27:33-04:00] Connecting to a local LXD over a Unix socket DEBUG [2023-09-21T20:27:33-04:00] Sending request to LXD etag= method=GET url="http://unix.socket/1.0" DEBUG [2023-09-21T20:27:33-04:00] Got response struct from LXD DEBUG [2023-09-21T20:27:33-04:00] { "config": {}, "api_extensions": [ "storage_zfs_remove_snapshots", "container_host_shutdown_timeout", "container_stop_priority", "container_syscall_filtering", "auth_pki", "container_last_used_at", "etag", "patch", "usb_devices", "https_allowed_credentials", "image_compression_algorithm", "directory_manipulation", "container_cpu_time", "storage_zfs_use_refquota", "storage_lvm_mount_options", "network", "profile_usedby", "container_push", "container_exec_recording", "certificate_update", "container_exec_signal_handling", "gpu_devices", "container_image_properties", "migration_progress", "id_map", "network_firewall_filtering", "network_routes", "storage", "file_delete", "file_append", "network_dhcp_expiry", "storage_lvm_vg_rename", "storage_lvm_thinpool_rename", "network_vlan", "image_create_aliases", "container_stateless_copy", "container_only_migration", "storage_zfs_clone_copy", "unix_device_rename", "storage_lvm_use_thinpool", "storage_rsync_bwlimit", "network_vxlan_interface", "storage_btrfs_mount_options", "entity_description", "image_force_refresh", "storage_lvm_lv_resizing", "id_map_base", "file_symlinks", "container_push_target", "network_vlan_physical", "storage_images_delete", "container_edit_metadata", "container_snapshot_stateful_migration", "storage_driver_ceph", "storage_ceph_user_name", "resource_limits", "storage_volatile_initial_source", "storage_ceph_force_osd_reuse", "storage_block_filesystem_btrfs", "resources", "kernel_limits", "storage_api_volume_rename", "macaroon_authentication", "network_sriov", "console", "restrict_devlxd", "migration_pre_copy", "infiniband", "maas_network", 
"devlxd_events", "proxy", "network_dhcp_gateway", "file_get_symlink", "network_leases", "unix_device_hotplug", "storage_api_local_volume_handling", "operation_description", "clustering", "event_lifecycle", "storage_api_remote_volume_handling", "nvidia_runtime", "container_mount_propagation", "container_backup", "devlxd_images", "container_local_cross_pool_handling", "proxy_unix", "proxy_udp", "clustering_join", "proxy_tcp_udp_multi_port_handling", "network_state", "proxy_unix_dac_properties", "container_protection_delete", "unix_priv_drop", "pprof_http", "proxy_haproxy_protocol", "network_hwaddr", "proxy_nat", "network_nat_order", "container_full", "candid_authentication", "backup_compression", "candid_config", "nvidia_runtime_config", "storage_api_volume_snapshots", "storage_unmapped", "projects", "candid_config_key", "network_vxlan_ttl", "container_incremental_copy", "usb_optional_vendorid", "snapshot_scheduling", "snapshot_schedule_aliases", "container_copy_project", "clustering_server_address", "clustering_image_replication", "container_protection_shift", "snapshot_expiry", "container_backup_override_pool", "snapshot_expiry_creation", "network_leases_location", "resources_cpu_socket", "resources_gpu", "resources_numa", "kernel_features", "id_map_current", "event_location", "storage_api_remote_volume_snapshots", "network_nat_address", "container_nic_routes", "rbac", "cluster_internal_copy", "seccomp_notify", "lxc_features", "container_nic_ipvlan", "network_vlan_sriov", "storage_cephfs", "container_nic_ipfilter", "resources_v2", "container_exec_user_group_cwd", "container_syscall_intercept", "container_disk_shift", "storage_shifted", "resources_infiniband", "daemon_storage", "instances", "image_types", "resources_disk_sata", "clustering_roles", "images_expiry", "resources_network_firmware", "backup_compression_algorithm", "ceph_data_pool_name", "container_syscall_intercept_mount", "compression_squashfs", "container_raw_mount", "container_nic_routed", 
"container_syscall_intercept_mount_fuse", "container_disk_ceph", "virtual-machines", "image_profiles", "clustering_architecture", "resources_disk_id", "storage_lvm_stripes", "vm_boot_priority", "unix_hotplug_devices", "api_filtering", "instance_nic_network", "clustering_sizing", "firewall_driver", "projects_limits", "container_syscall_intercept_hugetlbfs", "limits_hugepages", "container_nic_routed_gateway", "projects_restrictions", "custom_volume_snapshot_expiry", "volume_snapshot_scheduling", "trust_ca_certificates", "snapshot_disk_usage", "clustering_edit_roles", "container_nic_routed_host_address", "container_nic_ipvlan_gateway", "resources_usb_pci", "resources_cpu_threads_numa", "resources_cpu_core_die", "api_os", "container_nic_routed_host_table", "container_nic_ipvlan_host_table", "container_nic_ipvlan_mode", "resources_system", "images_push_relay", "network_dns_search", "container_nic_routed_limits", "instance_nic_bridged_vlan", "network_state_bond_bridge", "usedby_consistency", "custom_block_volumes", "clustering_failure_domains", "resources_gpu_mdev", "console_vga_type", "projects_limits_disk", "network_type_macvlan", "network_type_sriov", "container_syscall_intercept_bpf_devices", "network_type_ovn", "projects_networks", "projects_networks_restricted_uplinks", "custom_volume_backup", "backup_override_name", "storage_rsync_compression", "network_type_physical", "network_ovn_external_subnets", "network_ovn_nat", "network_ovn_external_routes_remove", "tpm_device_type", "storage_zfs_clone_copy_rebase", "gpu_mdev", "resources_pci_iommu", "resources_network_usb", "resources_disk_address", "network_physical_ovn_ingress_mode", "network_ovn_dhcp", "network_physical_routes_anycast", "projects_limits_instances", "network_state_vlan", "instance_nic_bridged_port_isolation", "instance_bulk_state_change", "network_gvrp", "instance_pool_move", "gpu_sriov", "pci_device_type", "storage_volume_state", "network_acl", "migration_stateful", "disk_state_quota", 
"storage_ceph_features", "projects_compression", "projects_images_remote_cache_expiry", "certificate_project", "network_ovn_acl", "projects_images_auto_update", "projects_restricted_cluster_target", "images_default_architecture", "network_ovn_acl_defaults", "gpu_mig", "project_usage", "network_bridge_acl", "warnings", "projects_restricted_backups_and_snapshots", "clustering_join_token", "clustering_description", "server_trusted_proxy", "clustering_update_cert", "storage_api_project", "server_instance_driver_operational", "server_supported_storage_drivers", "event_lifecycle_requestor_address", "resources_gpu_usb", "clustering_evacuation", "network_ovn_nat_address", "network_bgp", "network_forward", "custom_volume_refresh", "network_counters_errors_dropped", "metrics", "image_source_project", "clustering_config", "network_peer", "linux_sysctl", "network_dns", "ovn_nic_acceleration", "certificate_self_renewal", "instance_project_move", "storage_volume_project_move", "cloud_init", "network_dns_nat", "database_leader", "instance_all_projects", "clustering_groups", "ceph_rbd_du", "instance_get_full", "qemu_metrics", "gpu_mig_uuid", "event_project", "clustering_evacuation_live", "instance_allow_inconsistent_copy", "network_state_ovn", "storage_volume_api_filtering", "image_restrictions", "storage_zfs_export", "network_dns_records", "storage_zfs_reserve_space", "network_acl_log", "storage_zfs_blocksize", "metrics_cpu_seconds", "instance_snapshot_never", "certificate_token", "instance_nic_routed_neighbor_probe", "event_hub", "agent_nic_config", "projects_restricted_intercept", "metrics_authentication", "images_target_project", "cluster_migration_inconsistent_copy", "cluster_ovn_chassis", "container_syscall_intercept_sched_setscheduler", "storage_lvm_thinpool_metadata_size", "storage_volume_state_total", "instance_file_head", "instances_nic_host_name", "image_copy_profile", "container_syscall_intercept_sysinfo", "clustering_evacuation_mode", "resources_pci_vpd", 
"qemu_raw_conf", "storage_cephfs_fscache", "network_load_balancer", "vsock_api", "instance_ready_state", "network_bgp_holdtime", "storage_volumes_all_projects", "metrics_memory_oom_total", "storage_buckets", "storage_buckets_create_credentials", "metrics_cpu_effective_total", "projects_networks_restricted_access", "storage_buckets_local", "loki", "acme", "internal_metrics", "cluster_join_token_expiry", "remote_token_expiry", "init_preseed", "storage_volumes_created_at", "cpu_hotplug", "projects_networks_zones", "network_txqueuelen", "cluster_member_state", "instances_placement_scriptlet", "storage_pool_source_wipe", "zfs_block_mode", "instance_generation_id", "disk_io_cache", "amd_sev", "storage_pool_loop_resize", "migration_vm_live", "ovn_nic_nesting", "oidc", "network_ovn_l3only", "ovn_nic_acceleration_vdpa", "cluster_healing", "instances_state_total", "auth_user", "security_csm", "instances_rebuild", "numa_cpu_placement", "custom_volume_iso", "network_allocations", "storage_api_remote_volume_snapshot_copy", "zfs_delegate", "operations_get_query_all_projects", "metadata_configuration", "syslog_socket" ], "api_status": "stable", "api_version": "1.0", "auth": "trusted", "public": false, "auth_methods": [ "tls" ], "auth_user_name": "lengau", "auth_user_method": "unix", "environment": { "addresses": [], "architectures": [ "x86_64", "i686" ], "certificate": "-----BEGIN 
CERTIFICATE-----\nMIIB6jCCAXCgAwIBAgIRAJ75D64MPbBI1CQn3stFZYUwCgYIKoZIzj0EAwMwJjEM\nMAoGA1UEChMDTFhEMRYwFAYDVQQDDA1yb290QGh5cGVyaW9uMB4XDTIzMDkyMDAw\nMzE1M1oXDTMzMDkxNzAwMzE1M1owJjEMMAoGA1UEChMDTFhEMRYwFAYDVQQDDA1y\nb290QGh5cGVyaW9uMHYwEAYHKoZIzj0CAQYFK4EEACIDYgAElpLK8Kfvc3SJnMUx\nj4174o2XpIIf3KGJrXy3Y8uWnJJUVFqITfC5dLAN0C3sRaqV3avx0Z96R4y/1ocN\nP1koslBdXN8eYhWvm9sEAxo99J3sPdMQ17CLfqP4a1ItSIveo2IwYDAOBgNVHQ8B\nAf8EBAMCBaAwEwYDVR0lBAwwCgYIKwYBBQUHAwEwDAYDVR0TAQH/BAIwADArBgNV\nHREEJDAigghoeXBlcmlvbocEfwAAAYcQAAAAAAAAAAAAAAAAAAAAATAKBggqhkjO\nPQQDAwNoADBlAjAsCKdQzrrrDiykSKh/7/AxC1V5Z63SapQQ2Fz5jrqE+jAVc+5h\nIhAt66ou6vSD83sCMQDUO+PVwHloSHjRUIHmzg+DHS+fiM/leIe2Y9HAiUkb16g2\ntsn5g5vj1CvqiBdowKY=\n-----END CERTIFICATE-----\n", "certificate_fingerprint": "1dff5d9c93c1b03b6825f6d734c50fe10a48eff489f66c08bdcfb7fecfb2804d", "driver": "lxc | qemu", "driver_version": "5.0.3 | 8.0.4", "firewall": "nftables", "kernel": "Linux", "kernel_architecture": "x86_64", "kernel_features": { "idmapped_mounts": "true", "netnsid_getifaddrs": "true", "seccomp_listener": "true", "seccomp_listener_continue": "true", "shiftfs": "false", "uevent_injection": "true", "unpriv_fscaps": "true" }, "kernel_version": "6.2.0-33-generic", "lxc_features": { "cgroup2": "true", "core_scheduling": "true", "devpts_fd": "true", "idmapped_mounts_v2": "true", "mount_injection_file": "true", "network_gateway_device_route": "true", "network_ipvlan": "true", "network_l2proxy": "true", "network_phys_macvlan_mtu": "true", "network_veth_router": "true", "pidfd": "true", "seccomp_allow_deny_syntax": "true", "seccomp_notify": "true", "seccomp_proxy_send_notify_fd": "true" }, "os_name": "KDE neon", "os_version": "22.04", "project": "default", "server": "lxd", "server_clustered": false, "server_event_mode": "full-mesh", "server_name": "hyperion", "server_pid": 701717, "server_version": "5.18", "storage": "dir", "storage_version": "1", "storage_supported_drivers": [ { "Name": "ceph", "Version": "17.2.6", "Remote": true }, { 
"Name": "cephfs", "Version": "17.2.6", "Remote": true }, { "Name": "cephobject", "Version": "17.2.6", "Remote": true }, { "Name": "dir", "Version": "1", "Remote": false }, { "Name": "lvm", "Version": "2.03.11(2) (2021-01-08) / 1.02.175 (2021-01-08) / 4.47.0", "Remote": false }, { "Name": "zfs", "Version": "2.1.9-2ubuntu1.1", "Remote": false }, { "Name": "btrfs", "Version": "5.16.2", "Remote": false } ] } } Creating core20 DEBUG [2023-09-21T20:27:33-04:00] Connecting to a remote simplestreams server URL="https://cloud-images.ubuntu.com/buildd/releases" DEBUG [2023-09-21T20:27:33-04:00] Connected to the websocket: ws://unix.socket/1.0/events DEBUG [2023-09-21T20:27:33-04:00] Sending request to LXD etag= method=POST url="http://unix.socket/1.0/instances" DEBUG [2023-09-21T20:27:33-04:00] { "architecture": "", "config": {}, "devices": {}, "ephemeral": false, "profiles": null, "stateful": false, "description": "", "name": "core20", "source": { "type": "image", "certificate": "", "alias": "core20", "server": "https://cloud-images.ubuntu.com/buildd/releases", "protocol": "simplestreams", "mode": "pull", "allow_inconsistent": false }, "instance_type": "", "type": "container" } DEBUG [2023-09-21T20:27:33-04:00] Got operation from LXD DEBUG [2023-09-21T20:27:33-04:00] { "id": "ccba4370-cf41-4beb-99a8-64b33dacefe9", "class": "task", "description": "Creating instance", "created_at": "2023-09-21T20:27:33.264496564-04:00", "updated_at": "2023-09-21T20:27:33.264496564-04:00", "status": "Running", "status_code": 103, "resources": { "containers": [ "/1.0/instances/core20" ], "instances": [ "/1.0/instances/core20" ] }, "metadata": null, "may_cancel": false, "err": "", "location": "none" } DEBUG [2023-09-21T20:27:33-04:00] Sending request to LXD etag= method=GET url="http://unix.socket/1.0/operations/ccba4370-cf41-4beb-99a8-64b33dacefe9" DEBUG [2023-09-21T20:27:33-04:00] Got response struct from LXD DEBUG [2023-09-21T20:27:33-04:00] { "id": "ccba4370-cf41-4beb-99a8-64b33dacefe9", 
"class": "task", "description": "Creating instance", "created_at": "2023-09-21T20:27:33.264496564-04:00", "updated_at": "2023-09-21T20:27:33.264496564-04:00", "status": "Running", "status_code": 103, "resources": { "containers": [ "/1.0/instances/core20" ], "instances": [ "/1.0/instances/core20" ] }, "metadata": null, "may_cancel": false, "err": "", "location": "none" } Error: Failed instance creation: Failed getting remote image info: Failed getting image: The requested image couldn't be found ```
lxd monitor

```
$ lxc monitor
location: none
metadata:
  context:
    id: d5c4ffff-c858-4456-aca1-df4dd46671e0
    local: /var/snap/lxd/common/lxd/unix.socket
    remote: '@'
  level: debug
  message: Event listener server handler started
timestamp: "2023-09-21T20:29:20.204759073-04:00"
type: logging

location: none
metadata:
  context:
    ip: '@'
    method: GET
    protocol: unix
    url: /1.0
    username: lengau
  level: debug
  message: Handling API request
timestamp: "2023-09-21T20:29:23.477504862-04:00"
type: logging

location: none
metadata:
  context:
    ip: '@'
    method: GET
    protocol: unix
    url: /1.0/events
    username: lengau
  level: debug
  message: Handling API request
timestamp: "2023-09-21T20:29:23.495407265-04:00"
type: logging

location: none
metadata:
  context:
    id: e8f228e6-98f6-4beb-b720-89e1e2145a3e
    local: /var/snap/lxd/common/lxd/unix.socket
    remote: '@'
  level: debug
  message: Event listener server handler started
timestamp: "2023-09-21T20:29:23.495616243-04:00"
type: logging

location: none
metadata:
  context:
    ip: '@'
    method: POST
    protocol: unix
    url: /1.0/instances
    username: lengau
  level: debug
  message: Handling API request
timestamp: "2023-09-21T20:29:23.496120825-04:00"
type: logging

location: none
metadata:
  context: {}
  level: debug
  message: Responding to instance create
timestamp: "2023-09-21T20:29:23.496134872-04:00"
type: logging

location: none
metadata:
  context:
    instance: fitting-gazelle
    project: default
  level: debug
  message: No name provided for new instance, using auto-generated name
timestamp: "2023-09-21T20:29:23.497448459-04:00"
type: logging

location: none
metadata:
  context:
    class: task
    description: Creating instance
    operation: 08ba4a55-d6c3-486b-88a0-9efa97909a35
    project: default
  level: debug
  message: New operation
timestamp: "2023-09-21T20:29:23.501814782-04:00"
type: logging

location: none
metadata:
  context:
    class: task
    description: Creating instance
    operation: 08ba4a55-d6c3-486b-88a0-9efa97909a35
    project: default
  level: debug
  message: Started operation
timestamp: "2023-09-21T20:29:23.501862279-04:00"
type: logging

location: none
metadata:
  class: task
  created_at: "2023-09-21T20:29:23.497910641-04:00"
  description: Creating instance
  err: ""
  id: 08ba4a55-d6c3-486b-88a0-9efa97909a35
  location: none
  may_cancel: false
  metadata: null
  resources:
    containers:
    - /1.0/instances/fitting-gazelle
    instances:
    - /1.0/instances/fitting-gazelle
  status: Pending
  status_code: 105
  updated_at: "2023-09-21T20:29:23.497910641-04:00"
project: default
timestamp: "2023-09-21T20:29:23.501851538-04:00"
type: operation

location: none
metadata:
  class: task
  created_at: "2023-09-21T20:29:23.497910641-04:00"
  description: Creating instance
  err: ""
  id: 08ba4a55-d6c3-486b-88a0-9efa97909a35
  location: none
  may_cancel: false
  metadata: null
  resources:
    containers:
    - /1.0/instances/fitting-gazelle
    instances:
    - /1.0/instances/fitting-gazelle
  status: Running
  status_code: 103
  updated_at: "2023-09-21T20:29:23.497910641-04:00"
project: default
timestamp: "2023-09-21T20:29:23.501872576-04:00"
type: operation

location: none
metadata:
  context:
    ip: '@'
    method: GET
    protocol: unix
    url: /1.0/operations/08ba4a55-d6c3-486b-88a0-9efa97909a35
    username: lengau
  level: debug
  message: Handling API request
timestamp: "2023-09-21T20:29:23.502722785-04:00"
type: logging

location: none
metadata:
  context:
    URL: https://cloud-images.ubuntu.com/buildd/releases
  level: debug
  message: Connecting to a remote simplestreams server
timestamp: "2023-09-21T20:29:23.502234431-04:00"
type: logging

location: none
metadata:
  class: task
  created_at: "2023-09-21T20:29:23.497910641-04:00"
  description: Creating instance
  err: 'Failed getting remote image info: Failed getting image: The requested image couldn''t be found'
  id: 08ba4a55-d6c3-486b-88a0-9efa97909a35
  location: none
  may_cancel: false
  metadata: null
  resources:
    containers:
    - /1.0/instances/fitting-gazelle
    instances:
    - /1.0/instances/fitting-gazelle
  status: Failure
  status_code: 400
  updated_at: "2023-09-21T20:29:23.497910641-04:00"
project: default
timestamp: "2023-09-21T20:29:23.506722553-04:00"
type: operation

location: none
metadata:
  context:
    class: task
    description: Creating instance
    err: 'Failed getting remote image info: Failed getting image: The requested image couldn''t be found'
    operation: 08ba4a55-d6c3-486b-88a0-9efa97909a35
    project: default
  level: debug
  message: Failure for operation
timestamp: "2023-09-21T20:29:23.506693077-04:00"
type: logging

location: none
metadata:
  context:
    listener: e8f228e6-98f6-4beb-b720-89e1e2145a3e
    local: /var/snap/lxd/common/lxd/unix.socket
    remote: '@'
  level: debug
  message: Event listener server handler stopped
timestamp: "2023-09-21T20:29:23.509305027-04:00"
type: logging
```

tomponline commented 1 year ago

Thanks for the report. We weren't aware there were any users of lxd_combined.tar.gz images. We noticed this issue yesterday too, and it is fixed by https://github.com/canonical/lxd/pull/12294

I'll push into candidate today.

tomponline commented 1 year ago

It does handle images.linuxcontainers.org (as that uses lxd.tar.xz rather than lxd_combined.tar.gz), but the buildd remote uses a different image format, which we couldn't test (as we didn't know about its existence until snapcraft stopped working).

https://github.com/canonical/lxd/pull/12260#issuecomment-1721432929
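To see which of the two metadata formats mentioned above a remote publishes, one rough approach is to list the file types in its simplestreams product JSON. The sketch below assumes each item records its type in an `ftype` field and uses plain text matching rather than a real JSON parser. The function names are hypothetical and are not part of LXD or its CLI.

```shell
#!/bin/sh
# list_ftypes: read simplestreams product JSON on stdin and print the
# distinct "ftype" values, one per line.
# Assumption: items carry an "ftype" string field. This is text matching only.
list_ftypes() {
    grep -o '"ftype"[[:space:]]*:[[:space:]]*"[^"]*"' |
        sed 's/.*"\([^"]*\)"$/\1/' |
        LC_ALL=C sort -u
}

# uses_combined_metadata: succeed if the JSON on stdin lists an item of
# type lxd_combined.tar.gz (the format the buildd remote is said to use).
uses_combined_metadata() {
    list_ftypes | grep -qxF 'lxd_combined.tar.gz'
}
```

Piping a downloaded product file into `list_ftypes` shows at a glance whether a remote ships `lxd.tar.xz`, `lxd_combined.tar.gz`, or both, which helps decide which code path a regression test needs to cover.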

tomponline commented 1 year ago

Thanks for testing latest/candidate so we can cherry pick this before it goes to latest/stable.

tomponline commented 1 year ago

Pushed to candidate to build now: https://github.com/canonical/lxd-pkg-snap/commit/cc9222114dc1483d2524adfdf6dae49a7417bbaf

lengau commented 1 year ago

Confirmed this is working now. Thanks!

tomponline commented 1 year ago

Thanks for confirming, sorry about the interruption.