lxc / incus

Powerful system container and virtual machine manager
https://linuxcontainers.org/incus
Apache License 2.0

lxd-to-incus network migration problem #868

Open killua-eu opened 1 month ago

killua-eu commented 1 month ago

Required information

config:
  core.https_address: '[::]:8443'
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- network_sriov
- console
- restrict_dev_incus
- migration_pre_copy
- infiniband
- dev_incus_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- dev_incus_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- backup_compression
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
- certificate_self_renewal
- instance_project_move
- storage_volume_project_move
- cloud_init
- network_dns_nat
- database_leader
- instance_all_projects
- clustering_groups
- ceph_rbd_du
- instance_get_full
- qemu_metrics
- gpu_mig_uuid
- event_project
- clustering_evacuation_live
- instance_allow_inconsistent_copy
- network_state_ovn
- storage_volume_api_filtering
- image_restrictions
- storage_zfs_export
- network_dns_records
- storage_zfs_reserve_space
- network_acl_log
- storage_zfs_blocksize
- metrics_cpu_seconds
- instance_snapshot_never
- certificate_token
- instance_nic_routed_neighbor_probe
- event_hub
- agent_nic_config
- projects_restricted_intercept
- metrics_authentication
- images_target_project
- images_all_projects
- cluster_migration_inconsistent_copy
- cluster_ovn_chassis
- container_syscall_intercept_sched_setscheduler
- storage_lvm_thinpool_metadata_size
- storage_volume_state_total
- instance_file_head
- instances_nic_host_name
- image_copy_profile
- container_syscall_intercept_sysinfo
- clustering_evacuation_mode
- resources_pci_vpd
- qemu_raw_conf
- storage_cephfs_fscache
- network_load_balancer
- vsock_api
- instance_ready_state
- network_bgp_holdtime
- storage_volumes_all_projects
- metrics_memory_oom_total
- storage_buckets
- storage_buckets_create_credentials
- metrics_cpu_effective_total
- projects_networks_restricted_access
- storage_buckets_local
- loki
- acme
- internal_metrics
- cluster_join_token_expiry
- remote_token_expiry
- init_preseed
- storage_volumes_created_at
- cpu_hotplug
- projects_networks_zones
- network_txqueuelen
- cluster_member_state
- instances_placement_scriptlet
- storage_pool_source_wipe
- zfs_block_mode
- instance_generation_id
- disk_io_cache
- amd_sev
- storage_pool_loop_resize
- migration_vm_live
- ovn_nic_nesting
- oidc
- network_ovn_l3only
- ovn_nic_acceleration_vdpa
- cluster_healing
- instances_state_total
- auth_user
- security_csm
- instances_rebuild
- numa_cpu_placement
- custom_volume_iso
- network_allocations
- zfs_delegate
- storage_api_remote_volume_snapshot_copy
- operations_get_query_all_projects
- metadata_configuration
- syslog_socket
- event_lifecycle_name_and_project
- instances_nic_limits_priority
- disk_initial_volume_configuration
- operation_wait
- image_restriction_privileged
- cluster_internal_custom_volume_copy
- disk_io_bus
- storage_cephfs_create_missing
- instance_move_config
- ovn_ssl_config
- certificate_description
- disk_io_bus_virtio_blk
- loki_config_instance
- instance_create_start
- clustering_evacuation_stop_options
- boot_host_shutdown_action
- agent_config_drive
- network_state_ovn_lr
- image_template_permissions
- storage_bucket_backup
- storage_lvm_cluster
- shared_custom_block_volumes
- auth_tls_jwt
- oidc_claim
- device_usb_serial
- numa_cpu_balanced
- image_restriction_nesting
- network_integrations
- instance_memory_swap_bytes
- network_bridge_external_create
- network_zones_all_projects
- storage_zfs_vdev
- container_migration_stateful
- profiles_all_projects
- instances_scriptlet_get_instances
- instances_scriptlet_get_cluster_members
- network_acl_stateless
- instance_state_started_at
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
auth_user_name: pavstra
auth_user_method: unix
environment:
  addresses:
  - 10.14.12.1:8443
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    MIIFTDCCAzSgAwIBAgIQHpIoyS2zeT79wUw4iL/+EDANBgkqhkiG9w0BAQsFADA4
    MRwwGgYDVQQKExNsaW51eGNvbnRhaW5lcnMub3JnMRgwFgYDVQQDDA9yb290QGJs
    YWNrc3BhcmswHhcNMTgwODMwMTIwNjA0WhcNMjgwODI3MTIwNjA0WjA4MRwwGgYD
    VQQKExNsaW51eGNvbnRhaW5lcnMub3JnMRgwFgYDVQQDDA9yb290QGJsYWNrc3Bh
    cmswggIiMA0GCSqGSIb3DQEBAQUAA4ICDwAwggIKAoICAQCy4IgDYFN9R21j6OCT
    C4DAnngde8/nhp7LsUzjcPUJts5wtx0UrvhGp+ooXm632e3NNOO4TBF53XzvpUUm
    04wVBCMNc92jiPHezvRut7gphhUGRzXy3igRNymzjTbtxuCE6EuYS7FRHUAQJCd0
    lgl5s6crp42SsELwC7NhEBxcAK+B/5WFnXReJMS+JYwXtvXLydxOKmPI1kNDeKtd
    nF3i9MnbDVCZOABOKWVybe+Ut8+QKlcxaJ60GsTQHohm5W6jxzjpSIcwQ3rE7P9O
    ERvlVS5OUkDmRr2eXkS4E1pL2vY9nZxekmii1ZewYZpOK5M9ASbuHxP4oZlo72D1
    ivCscwfrF3JpB093pATYiq6Kw34nflAb9XdHWARLr+6U7gabEZhKBFj5CXrBUQ0j
    BAELJ5ofYXzYkCXzoSFBWpCg+UAA8xhMgUBwncF5iHnXBDNGZ8KHzJDKuJl5BdHl
    5Vev4fxHhiVSmQHnABbRtpxlnnnxjq5E2uDk/277nFfhDq3vqjOsuYAHdiEL8SCt
    TdflaQwZCzAwGuhBKZ+F9abjQGoIKsVCY1Eel7kvW89hvVsNNke5piJZG9NPESu6
    GmsW4lxhnrXAVS9aoN1FfyUmqKXl5NodbLz8rmU7YIBm+iSkPZUUyLMA+UVmOdi4
    aVIV7v/chozBBPFeB4P4OG7oRQIDAQABo1IwUDAOBgNVHQ8BAf8EBAMCBaAwEwYD
    VR0lBAwwCgYIKwYBBQUHAwEwDAYDVR0TAQH/BAIwADAbBgNVHREEFDASggpibGFj
    a3NwYXJrhwQKDgwBMA0GCSqGSIb3DQEBCwUAA4ICAQAu9mGpFKkrTZ0O+vD6gdDw
    +zG3o3YzUPrJPcIVcvgzU6nuXv4VKGtstNKKSUF+gJM20t0VtgVrBdNQ68n2C9to
    kKp51S1Meg0hIxCNDPgKNZ1nMh0BcNz5sXJANH33khDXU+1rlOEYQ41ksxw0fnSt
    cJLR1rhHvN84OCDUT8vAlHtVGYvD8B3QBBOGUpMFr5nt4a4LhRsHgJ/o7aJLdM0N
    1e/zUN3RlD5fffUpkkw2L3t7dj0apnIU2hI0le+VVKRdZvI9XkgBo+Vh4po7K7L9
    cJI++7QhOqcw5fVchWJJkaIg0HejtFSXvq84VA4X+ViJMO9NvdQ8cC+NxRVIGZBi
    vil63JqqMdLp4bakkByZwZ0hO1aGNhFMVrTQ+E4LNVTpw4dH7hz+KcixjGo6MY0y
    8FT+44LvrImCOEwiij+QE7Ic4hd8RI9hgvBVU7Z9kv3MnQSQTrBVOtq6lfy+pdwt
    kyu3fE4mIHgXOCbAQwyZv04FMyhY+IwfBbWVxEYqdcuH4dloVQJfWxRnlDoQX7cA
    Yf5+X8Xfoqw4ZQEnf/V+XCeSHBygPlIg6jTaMKu+IysSPqb2N8gUK0Krwgj1vKg+
    WSrgTzNd5Qguh9bBLUTCRBdqpa9DQrnb/Zh5KeQMFcwzE0HmFM5u/pzZz/1x1tIa
    XxkOLUkeagevUsYUQu7xlA==
    -----END CERTIFICATE-----
  certificate_fingerprint: fb2d5c450981ece4d65cae4f2d9f78201cb6a0c2eea973ec2ba22998e40fb3f1
  driver: lxc | qemu
  driver_version: 6.0.0 | 9.0.0
  firewall: nftables
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    idmapped_mounts: "true"
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    seccomp_listener_continue: "true"
    uevent_injection: "true"
    unpriv_binfmt: "false"
    unpriv_fscaps: "true"
  kernel_version: 5.15.0-106-generic
  lxc_features:
    cgroup2: "true"
    core_scheduling: "true"
    devpts_fd: "true"
    idmapped_mounts_v2: "true"
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    network_veth_router: "true"
    pidfd: "true"
    seccomp_allow_deny_syntax: "true"
    seccomp_notify: "true"
    seccomp_proxy_send_notify_fd: "true"
  os_name: Ubuntu
  os_version: "22.04"
  project: default
  server: incus
  server_clustered: false
  server_event_mode: full-mesh
  server_name: blackspark.fenix.local
  server_pid: 1298
  server_version: "6.1"
  storage: dir
  storage_version: "1"
  storage_supported_drivers:
  - name: dir
    version: "1"
    remote: false
  - name: lvm
    version: 2.03.11(2) (2021-01-08) / 1.02.175 (2021-01-08) / 4.45.0
    remote: false
  - name: lvmcluster
    version: 2.03.11(2) (2021-01-08) / 1.02.175 (2021-01-08) / 4.45.0
    remote: true

Issue description

Incus cannot initialize its bridge. The relevant entry from /var/log/incus/incusd.log:

....
time="2024-05-13T23:40:58+02:00" level=error msg="Failed initializing network" err="Failed starting: The DNS and DHCP service exited prematurely: exit status 2 (\"dnsmasq: failed to create listening socket for 10.87.127.1: Address already in use\")" network=lxdbr0 project=default
....

The same error is emitted when trying to create a new network.

killua-eu commented 1 month ago

I suspect this is related to the server having been upgraded over time from what was originally Ubuntu 18.04, as I see some arcane networking bash scripts all over /etc.

stgraber commented 1 month ago

That usually means that you have something like dnsmasq or bind9/named already listening on port 53 and conflicting with dnsmasq.

You may want to look at the netstat -lnp | grep 53 output or similar to see what's going on.
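A plain grep 53 also matches ports such as 5353, as well as any PID or address that happens to contain "53", so filtering on the local-address column can be less noisy. The following is a minimal Python sketch of that idea, assuming the usual netstat -lnp column layout (local address in the fourth column). The function name listeners_on_port and the SAMPLE text are hypothetical illustrations, not output from the reporter's system.

```python
def listeners_on_port(netstat_output, port):
    """Return the tcp/udp rows of `netstat -lnp` output bound to exactly `port`."""
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only tcp/udp rows (including tcp6/udp6) carry a local address
        # in the fourth column; headers and unix-socket rows are skipped.
        if len(fields) < 4 or not fields[0].startswith(("tcp", "udp")):
            continue
        # Split on the last colon so IPv6 forms such as ":::53" also work.
        _, _, local_port = fields[3].rpartition(":")
        if local_port == str(port):
            matches.append(line)
    return matches


# Hypothetical sample text, made up for illustration only.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.53:53           0.0.0.0:*               LISTEN      812/systemd-resolve
tcp        0      0 0.0.0.0:5353            0.0.0.0:*               LISTEN      953/example
udp        0      0 0.0.0.0:67              0.0.0.0:*                           1153/dnsmasq
"""

if __name__ == "__main__":
    for row in listeners_on_port(SAMPLE, 53):
        print(row)
```

On a real host the input text could come from subprocess.run(["netstat", "-lnp"], capture_output=True, text=True).stdout, run as root so that the PID/Program column is populated.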

stgraber commented 1 month ago

@killua-eu have you been able to figure out what's going on? Still seems most likely to be a conflict with another DNS or DHCP server on the system.

rogtino commented 3 weeks ago

I ran into this just now when running sudo incus admin init. It fails with:

Error: Failed to create local member network "incusbr0" in project "default": The DNS and DHCP service exited prematurely: exit status 2 ("dnsmasq: failed to bind DHCP server socket: Address already in use")

But I have no process bound to port 53: sudo netstat -lnp | grep ":53 " returns nothing.

stgraber commented 3 weeks ago

@rogtino maybe it's the DHCP side that's conflicting with something? :67?
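When netstat shows nothing, another way to check is to attempt the bind directly and see whether the kernel reports "Address already in use". The sketch below is a rough Python probe along those lines; port_in_use is a hypothetical helper, and it is only an approximation because it sets no socket options, whereas dnsmasq may configure its sockets differently. DHCP servers use UDP port 67, and binding ports below 1024 normally requires root.

```python
import errno
import socket


def port_in_use(address, port, kind=socket.SOCK_DGRAM):
    """Return True if binding address:port fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, kind) as probe:
        try:
            probe.bind((address, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            # Anything else (EACCES without root, EADDRNOTAVAIL for an
            # address not configured on this host) is a different problem.
            raise
        return False


if __name__ == "__main__":
    # Needs root for port 67; checks the wildcard address over UDP.
    print("UDP 0.0.0.0:67 in use:", port_in_use("0.0.0.0", 67))
```

The same helper can be pointed at port 53 with kind=socket.SOCK_STREAM or socket.SOCK_DGRAM to cover the DNS side, using the bridge address once it is configured on the host.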

rogtino commented 3 weeks ago

Thanks for your reply. sudo netstat -lnp | grep ":67 " returns nothing. Just FYI, I have set DNSStubListener=no in /etc/systemd/resolved.conf because there seems to be a conflict between dnsmasq and systemd-resolved.

stgraber commented 4 days ago

@killua-eu did you have any luck?

@rogtino on your end, if this is still an unresolved issue, maybe look at /var/log/syslog or the systemd journal to see if dnsmasq gives more of a hint as to what port it's conflicting with? Also a full netstat -lnp output would be useful.

Or are you saying that the stub resolver was the source of the conflict? If so, that's pretty odd as it normally runs on a specific loopback address (127.0.0.53 if I recall correctly) and so shouldn't get in the way of something trying to bind a random 10.X.Y.1 address.
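The reasoning about bind addresses can be illustrated with a small Python sketch, assuming a Linux host where the whole 127.0.0.0/8 range is bindable. It uses loopback addresses and an unprivileged UDP port as stand-ins for the stub resolver and the bridge address; binds_conflict and free_udp_port are hypothetical helpers. Two sockets on different specific addresses should coexist on the same port, while a socket bound to the wildcard address 0.0.0.0 should block a later bind to any specific address on that port.

```python
import errno
import socket


def free_udp_port():
    """Pick a UDP port that is currently unused on every local address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.bind(("0.0.0.0", 0))
        return s.getsockname()[1]


def binds_conflict(first_addr, second_addr):
    """Bind first_addr, then report whether second_addr hits EADDRINUSE."""
    port = free_udp_port()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as first:
        first.bind((first_addr, port))
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as second:
            try:
                second.bind((second_addr, port))
            except OSError as exc:
                if exc.errno == errno.EADDRINUSE:
                    return True
                raise
            return False


if __name__ == "__main__":
    # Specific address vs. another specific address on the same port.
    print("127.0.0.53 then 127.0.0.1:", binds_conflict("127.0.0.53", "127.0.0.1"))
    # Wildcard listener vs. a specific address on the same port.
    print("0.0.0.0 then 127.0.0.1:", binds_conflict("0.0.0.0", "127.0.0.1"))
```

If a conflict like the one in this issue does occur, a service bound to the wildcard address on port 53 or 67 would therefore be a more likely candidate than one bound only to 127.0.0.53.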