lxc / incus

Powerful system container and virtual machine manager
https://linuxcontainers.org/incus
Apache License 2.0

incus-migrate should detect non-raw images #658

Closed by candlerb 4 months ago

candlerb commented 4 months ago

Required information

# incus info
config:
  core.https_address: 100.64.0.1:8443
  core.metrics_address: :8444
  core.metrics_authentication: "false"
  images.auto_update_interval: "0"
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- network_sriov
- console
- restrict_dev_incus
- migration_pre_copy
- infiniband
- dev_incus_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- dev_incus_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- backup_compression
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
- certificate_self_renewal
- instance_project_move
- storage_volume_project_move
- cloud_init
- network_dns_nat
- database_leader
- instance_all_projects
- clustering_groups
- ceph_rbd_du
- instance_get_full
- qemu_metrics
- gpu_mig_uuid
- event_project
- clustering_evacuation_live
- instance_allow_inconsistent_copy
- network_state_ovn
- storage_volume_api_filtering
- image_restrictions
- storage_zfs_export
- network_dns_records
- storage_zfs_reserve_space
- network_acl_log
- storage_zfs_blocksize
- metrics_cpu_seconds
- instance_snapshot_never
- certificate_token
- instance_nic_routed_neighbor_probe
- event_hub
- agent_nic_config
- projects_restricted_intercept
- metrics_authentication
- images_target_project
- images_all_projects
- cluster_migration_inconsistent_copy
- cluster_ovn_chassis
- container_syscall_intercept_sched_setscheduler
- storage_lvm_thinpool_metadata_size
- storage_volume_state_total
- instance_file_head
- instances_nic_host_name
- image_copy_profile
- container_syscall_intercept_sysinfo
- clustering_evacuation_mode
- resources_pci_vpd
- qemu_raw_conf
- storage_cephfs_fscache
- network_load_balancer
- vsock_api
- instance_ready_state
- network_bgp_holdtime
- storage_volumes_all_projects
- metrics_memory_oom_total
- storage_buckets
- storage_buckets_create_credentials
- metrics_cpu_effective_total
- projects_networks_restricted_access
- storage_buckets_local
- loki
- acme
- internal_metrics
- cluster_join_token_expiry
- remote_token_expiry
- init_preseed
- storage_volumes_created_at
- cpu_hotplug
- projects_networks_zones
- network_txqueuelen
- cluster_member_state
- instances_placement_scriptlet
- storage_pool_source_wipe
- zfs_block_mode
- instance_generation_id
- disk_io_cache
- amd_sev
- storage_pool_loop_resize
- migration_vm_live
- ovn_nic_nesting
- oidc
- network_ovn_l3only
- ovn_nic_acceleration_vdpa
- cluster_healing
- instances_state_total
- auth_user
- security_csm
- instances_rebuild
- numa_cpu_placement
- custom_volume_iso
- network_allocations
- zfs_delegate
- storage_api_remote_volume_snapshot_copy
- operations_get_query_all_projects
- metadata_configuration
- syslog_socket
- event_lifecycle_name_and_project
- instances_nic_limits_priority
- disk_initial_volume_configuration
- operation_wait
- image_restriction_privileged
- cluster_internal_custom_volume_copy
- disk_io_bus
- storage_cephfs_create_missing
- instance_move_config
- ovn_ssl_config
- certificate_description
- disk_io_bus_virtio_blk
- loki_config_instance
- instance_create_start
- clustering_evacuation_stop_options
- boot_host_shutdown_action
- agent_config_drive
- network_state_ovn_lr
- image_template_permissions
- storage_bucket_backup
- storage_lvm_cluster
- shared_custom_block_volumes
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
auth_user_name: root
auth_user_method: unix
environment:
  addresses:
  - 100.64.0.1:8443
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
   <SNIP>
    -----END CERTIFICATE-----
  certificate_fingerprint: 32f7327dbd179a984790bd74eab34167d1becf1bd81dfbaca7a9d6e8e1a0bc9c
  driver: qemu | lxc
  driver_version: 8.2.1 | 5.0.3
  firewall: nftables
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    idmapped_mounts: "true"
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    seccomp_listener_continue: "true"
    uevent_injection: "true"
    unpriv_binfmt: "false"
    unpriv_fscaps: "true"
  kernel_version: 5.15.0-101-generic
  lxc_features:
    cgroup2: "true"
    core_scheduling: "true"
    devpts_fd: "true"
    idmapped_mounts_v2: "true"
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    network_veth_router: "true"
    pidfd: "true"
    seccomp_allow_deny_syntax: "true"
    seccomp_notify: "true"
    seccomp_proxy_send_notify_fd: "true"
  os_name: Ubuntu
  os_version: "22.04"
  project: default
  server: incus
  server_clustered: false
  server_event_mode: full-mesh
  server_name: brian-kit
  server_pid: 1699
  server_version: "0.6"
  storage: lvm | zfs | dir
  storage_version: 2.03.11(2) (2021-01-08) / 1.02.175 (2021-01-08) / 4.45.0 | 2.1.5-1ubuntu6~22.04.3
    | 1
  storage_supported_drivers:
  - name: lvm
    version: 2.03.11(2) (2021-01-08) / 1.02.175 (2021-01-08) / 4.45.0
    remote: false
  - name: lvmcluster
    version: 2.03.11(2) (2021-01-08) / 1.02.175 (2021-01-08) / 4.45.0
    remote: true
  - name: zfs
    version: 2.1.5-1ubuntu6~22.04.3
    remote: false
  - name: btrfs
    version: 5.16.2
    remote: false
  - name: dir
    version: "1"
    remote: false

Issue description

When you pass a qcow2 image to bin.linux.incus-migrate.x86_64, as far as I can tell it treats it as a raw image. The symptom is that the image is silently accepted and the migration appears to succeed, but the resulting VM won't boot.

Steps to reproduce

$ cd /var/tmp
$ wget https://cloud-images.ubuntu.com/releases/jammy/release-20240319/ubuntu-22.04-server-cloudimg-amd64-disk-kvm.img
...
$ file ubuntu-22.04-server-cloudimg-amd64-disk-kvm.img
ubuntu-22.04-server-cloudimg-amd64-disk-kvm.img: QEMU QCOW2 Image (v2), 2361393152 bytes

Now migrate:

$ ~/bin.linux.incus-migrate.x86_64 --version
0.6
$ sudo ~/bin.linux.incus-migrate.x86_64
Please provide Incus server URL: https://100.64.0.1:8443
Certificate fingerprint: 32f7327dbd179a984790bd74eab34167d1becf1bd81dfbaca7a9d6e8e1a0bc9c
ok (y/n)? y

1) Use a certificate token
2) Use an existing TLS authentication certificate
3) Generate a temporary TLS authentication certificate
Please pick an authentication mechanism above: 1
Please provide the certificate token:
<On another console: "incus config trust add temp">
<Paste the result here>

Remote server:
  Hostname: brian-kit
  Version: 0.6

Would you like to create a container (1) or virtual-machine (2)?: 2
Name of the new instance: ubucloud
Please provide the path to a disk, partition, or image file: ubuntu-22.04-server-cloudimg-amd64-disk-kvm.img
Does the VM support UEFI Secure Boot? [default=no]:

Instance to be created:
  Name: ubucloud
  Project: default
  Type: virtual-machine
  Source: ubuntu-22.04-server-cloudimg-amd64-disk-kvm.img
  Config:
    security.secureboot: "false"

Additional overrides can be applied at this stage:
1) Begin the migration with the above configuration
2) Override profile list
3) Set additional configuration options
4) Change instance storage pool or volume size
5) Change instance network

Please pick one of the options above [default=1]: 1
Instance ubucloud successfully created

Now attempt to boot:

$ incus start --console ubucloud
BdsDxe: failed to load Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1): Not Found

>>Start PXE over IPv4.
... goes on to try PXE over v6, HTTP over v4 and v6, before giving up

Check config:

$ incus config show -e ubucloud
architecture: x86_64
config:
  security.secureboot: "false"
  volatile.cloud-init.instance-id: 49d18443-de8c-4acb-a60b-07c50c65e785
  volatile.eth0.hwaddr: 00:16:3e:00:ea:86
  volatile.last_state.power: STOPPED
  volatile.last_state.ready: "false"
  volatile.uuid: 06bed413-b3a3-455d-bbed-1c19a9e9ee0e
  volatile.uuid.generation: 06bed413-b3a3-455d-bbed-1c19a9e9ee0e
  volatile.vsock_id: "3317575896"
devices:
  eth0:
    name: eth0
    network: incusbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

Then I repeated the process, but converting to a raw image first:

$ incus delete ubucloud
$ qemu-img convert -O raw ubuntu-22.04-server-cloudimg-amd64-disk-kvm.img ubuntu-22.04-server-cloudimg-amd64-disk-kvm.raw
$ sudo ~/bin.linux.incus-migrate.x86_64
...
Please provide the path to a disk, partition, or image file: ubuntu-22.04-server-cloudimg-amd64-disk-kvm.raw
...
$ incus start --console ubucloud
BdsDxe: loading Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)
BdsDxe: starting Boot0001 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0x1,0x1)/Pci(0x0,0x0)/Scsi(0x0,0x1)
GRUB_FORCE_PARTUUID set, attempting initrdless boot.
Linux version 5.15.0-1051-kvm (buildd@lcy02-amd64-091) (gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #56-Ubuntu SMP Thu Feb 8 23:30:16 UTC 2024 (Ubuntu 5.15.0-1051.56-kvm 5.15.136)
Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-1051-kvm root=PARTUUID=014d4b32-e1b9-4b90-8b5e-0d8883621900 ro console=tty1 console=ttyS0 panic=-1
... etc

Observation

At minimum, incus-migrate could provide a warning message saying that any image provided must be raw.

Ideally it should detect non-raw image formats such as qcow2, and either offer to convert them to raw or reject them.
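
Until the tool does something like this itself, the same check-and-convert step can be done by hand before running incus-migrate. The following is only a sketch of that workaround, not incus-migrate behaviour: the function names image_format and ensure_raw and the QEMU_IMG variable are made up for this example, and it assumes qemu-img is installed and prints a "file format:" line as in the qemu-img info output quoted in this thread.

```shell
#!/bin/sh
# Hypothetical pre-flight helper (not part of incus-migrate).
# QEMU_IMG may be set to point at a different qemu-img binary.
QEMU_IMG=${QEMU_IMG:-qemu-img}

# Print the format that "qemu-img info" reports for the image in $1.
image_format() {
    "$QEMU_IMG" info "$1" | sed -n 's/^file format: //p' | head -n 1
}

# Print the path of a raw image: $1 itself if it is already raw,
# otherwise a converted copy written next to it as "$1.raw".
ensure_raw() {
    fmt=$(image_format "$1")
    if [ -z "$fmt" ]; then
        echo "could not determine the format of $1" >&2
        return 1
    fi
    if [ "$fmt" = raw ]; then
        printf '%s\n' "$1"
    else
        "$QEMU_IMG" convert -f "$fmt" -O raw "$1" "$1.raw" &&
            printf '%s\n' "$1.raw"
    fi
}
```

The path printed by ensure_raw is what would then be given at the "Please provide the path to a disk, partition, or image file" prompt.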

stgraber commented 4 months ago

Detection isn't really an option as we may need to read a significant amount of data to determine that, and it would significantly increase the complexity of what's supposed to be a very small tool that can be built and distributed statically.

But we can certainly make the prompt clearer about image files.

candlerb commented 4 months ago

> Detection isn't really an option as we may need to read a significant amount of data to determine that

Not very much, I think. The magic database (magic.mgc) used by file is able to detect it:

$ file /home/nsrc/nsrc-vm/output/nsrc-nmm.qcow2
/home/nsrc/nsrc-vm/output/nsrc-nmm.qcow2: QEMU QCOW2 Image (v3), 53687091200 bytes
$

Or:

$ qemu-img info /home/nsrc/nsrc-vm/output/nsrc-nmm.qcow2
image: /home/nsrc/nsrc-vm/output/nsrc-nmm.qcow2
file format: qcow2
virtual size: 50 GiB (53687091200 bytes)
disk size: 2.08 GiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false
$
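
To illustrate how little data the qcow2 case needs: a QCOW header starts with the four magic bytes "QFI\xfb", followed by a 4-byte big-endian version number, so reading the first 8 bytes of the file is enough. Below is a minimal shell sketch of that check, assuming a POSIX od. The function name qcow_hint and its output strings are made up for this example and are not anything incus-migrate provides. Other formats (VMDK, VHDX and so on) would each need their own check.

```shell
#!/bin/sh
# Sketch: guess whether a file is a QCOW image from its first 8 bytes only.
# qcow_hint is a hypothetical name used for illustration.
qcow_hint() {
    # -N8 reads 8 bytes, -An drops the offset column, -tx1 prints hex bytes;
    # tr keeps only the hex digits, giving e.g. "514649fb00000003".
    hdr=$(od -An -tx1 -N8 < "$1" | tr -dc '0-9a-f')
    case "$hdr" in
        514649fb????????)
            # Bytes 0-3 are the magic "QFI\xfb"; bytes 4-7 are the version.
            printf 'qcow v%d\n' "0x${hdr#514649fb}"
            ;;
        *)
            # No QCOW magic: could be raw, or some other format entirely.
            printf 'not qcow (possibly raw)\n'
            ;;
    esac
}
```

The version printed here comes from the same header field that gives the (v2) and (v3) in the file output above, so the two should agree.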