canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Clarify cluster join question handling #9593

Closed: ykazakov closed this issue 2 years ago

ykazakov commented 2 years ago

Required information

$ lxc info
config:
  cluster.https_address: xxx.xx.xx.xxx:8443
  core.https_address: xxx.xx.xx.xxx:8443
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- devlxd_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- candid_authentication
- backup_compression
- candid_config
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- candid_config_key
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- rbac
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
environment:
  addresses:
  - xxx.xx.xx.xxx:8443
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    MIICCjCCAY+gAwIBAgIQGIxWDmgO+tBkvSzfJPZ9EzAKBggqhkjOPQQDAzA2MRww
    GgYDVQQKExNsaW51eGNvbnRhaW5lcnMub3JnMRYwFAYDVQQDDA1yb290QGJvbWJh
    ZGlsMB4XDTIwMDUyMjIxMTAzNFoXDTMwMDUyMDIxMTAzNFowNjEcMBoGA1UEChMT
    bGludXhjb250YWluZXJzLm9yZzEWMBQGA1UEAwwNcm9vdEBib21iYWRpbDB2MBAG
    ByqGSM49AgEGBSuBBAAiA2IABL1z0GKIvqk3ZKpuGd25ry9zzabn35T8zAiKDzlI
    8CJrTxz8jMrlBguqEiQAOlBfAkuyGkwK5R80F0CfS1rHso//I1GZk5RwjQtVDXdo
    snbVCkGblJwzF5u/H7JTcOENGqNiMGAwDgYDVR0PAQH/BAQDAgWgMBMGA1UdJQQM
    MAoGCCsGAQUFBwMBMAwGA1UdEwEB/wQCMAAwKwYDVR0RBCQwIoIIYm9tYmFkaWyH
    BH8AAAGHEAAAAAAAAAAAAAAAAAAAAAEwCgYIKoZIzj0EAwMDaQAwZgIxAIWzWdac
    IW/mH2p6VIec1ItXXuWCaNaeY2gTIUdv7N+Lv12oAKcGFHXn0/1Ue4I6IAIxAMNi
    7PLpEjQqlM3ZXiIvayLCQ9wT469aqVCtIfwo+eLgsqxVIEAj38VMZk1yUg8AaQ==
    -----END CERTIFICATE-----
  certificate_fingerprint: 494f3d4c7a10429f9f9a99c1d7472e686b07cc5a8a714e2c0518aaf9abb249c5
  driver: lxc | qemu
  driver_version: 4.0.11 | 6.1.0
  firewall: nftables
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    seccomp_listener_continue: "true"
    shiftfs: "false"
    uevent_injection: "true"
    unpriv_fscaps: "true"
  kernel_version: 5.4.0-90-generic
  lxc_features:
    cgroup2: "true"
    core_scheduling: "true"
    devpts_fd: "true"
    idmapped_mounts_v2: "true"
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    network_veth_router: "true"
    pidfd: "true"
    seccomp_allow_deny_syntax: "true"
    seccomp_notify: "true"
    seccomp_proxy_send_notify_fd: "true"
  os_name: Ubuntu
  os_version: "20.04"
  project: default
  server: lxd
  server_clustered: true
  server_name: server1
  server_pid: 99290
  server_version: "4.20"
  storage: btrfs
  storage_version: 5.4.1
  storage_supported_drivers:
  - name: ceph
    version: 15.2.14
    remote: true
  - name: btrfs
    version: 5.4.1
    remote: false
  - name: cephfs
    version: 15.2.14
    remote: true
  - name: dir
    version: "1"
    remote: false
  - name: lvm
    version: 2.03.07(2) (2019-11-30) / 1.02.167 (2019-11-30) / 4.41.0
    remote: false
  - name: zfs
    version: 0.8.3-1ubuntu12.12
    remote: false

Issue description

Upon joining an existing cluster, lxd asks for the size of the local storage pool. However, this value does not seem to be used.
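
For reference, the same join can be driven non-interactively with a preseed, in which case the interactive answers become member_config entries for the "local" pool. The sketch below is only an illustration built from the values used in the steps further down; the cluster authentication keys in particular (cluster_token here) may differ between LXD versions.

$ cat <<'EOF' | sudo lxd init --preseed
cluster:
  enabled: true
  server_name: server1
  server_address: xxx.xx.xx.xxx:8443
  cluster_address: <address of an existing member>:8443
  cluster_token: <TOKEN>   # older versions use cluster_certificate + cluster_password instead
  member_config:
  - entity: storage-pool
    name: local
    key: source
    value: /dev/ubuntu-vg/lxd-lv
  - entity: storage-pool
    name: local
    key: size
    value: 1TB
EOF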

Steps to reproduce

  1. Reinstall lxd from snap:
    $ sudo snap remove lxd
    lxd removed
    $ sudo snap install lxd
    lxd 4.20 from Canonical✓ installed
  2. Prepare the btrfs partition
    $ sudo wipefs -a /dev/ubuntu-vg/lxd-lv
    /dev/ubuntu-vg/lxd-lv: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
  3. Obtain a token on another cluster member
    $ lxc cluster add server1
    Member server1 join token:
    <TOKEN>
  4. Set up lxd:
    $ sudo lxd init
    Would you like to use LXD clustering? (yes/no) [default=no]: yes
    What IP address or DNS name should be used to reach this node? [default=xxx.xx.xx.xxx]: 
    Are you joining an existing cluster? (yes/no) [default=no]: yes
    Do you have a join token? (yes/no/[token]) [default=no]: <TOKEN>
    All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
    Choose "size" property for storage pool "local": 1TB
    Choose "source" property for storage pool "local": /dev/ubuntu-vg/lxd-lv
    Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 
  5. Observe that the size of the local storage pool is different from what was entered (a way to check the stored config keys is sketched right after this list):
    $ lxc storage info local --target server1
    info:
      description: ""
      driver: btrfs
      name: local
      space used: 3.93MB
      total space: 5.98TB
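
One way to see whether the entered value was at least recorded is to look at the member's configured pool keys rather than the usage figures; a possible check (output not captured from the affected system):

$ # show the configured keys (size/source) for this member, as opposed to live usage
$ lxc storage show local --target server1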

Strangely, the total and used space still do not match the size of the file system:

$ sudo mount /dev/ubuntu-vg/lxd-lv /mnt/lxd-lv/
$ df -h /mnt/lxd-lv/
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/ubuntu--vg-lxd--lv  5.5T  3.8M  5.5T   1% /mnt/lxd-lv
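
If it helps to narrow down where the 5.98TB figure comes from, the btrfs tooling can report the device size and allocation directly; a possible cross-check, assuming the volume is still mounted at /mnt/lxd-lv:

$ sudo btrfs filesystem show /mnt/lxd-lv
$ sudo btrfs filesystem usage /mnt/lxd-lv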
stgraber commented 2 years ago

"lxd init" cluster questions can be a bit confusing as they're generated based on the existing config and so doesn't really allow for a curated flow like the one you get on standalone systems.

In this case, the size property is asked for because you could have answered the source question with an empty string, causing LXD to create a loop file for your pool, which does use the size property.

However, when passing in an existing block device or VG, that property is meaningless.
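
The same distinction can be illustrated outside of clustering with plain pool creation; the pool names below are made up for the example:

$ # no source given: LXD creates a loop file, so the size property is honoured
$ lxc storage create demo-loop btrfs size=50GB

$ # existing block device as source: the device's own capacity is what counts
$ lxc storage create demo-dev btrfs source=/dev/ubuntu-vg/lxd-lv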

ykazakov commented 2 years ago

I see. Could it perhaps make sense to allow some sensible default value, e.g. 0 = autodetect? I do not think many would create a cluster with a loop file.

stgraber commented 2 years ago

Well, except for all of the ones that are built in VMs for testing, where they'll usually default to ZFS on loop and so will use the property :)

There effectively are defaults for most, if not all, of those; you can leave them empty, they just may not be defaults that you like! In this case, leaving both of them empty would have resulted in a loop-backed LVM pool, with the size using LXD's default (a percentage of available space with a fixed minimum/maximum allocation).
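
In other words, simply pressing Enter at both prompts is a valid answer. As a rough sketch of what that looks like afterwards on a snap install (the loop file path and the member-specific size key are assumptions and may vary by version):

Choose "size" property for storage pool "local":
Choose "source" property for storage pool "local":

$ # inspect what was actually created (path assumes the snap package)
$ ls -lh /var/snap/lxd/common/lxd/disks/local.img
$ lxc storage get local size --target server1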