canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Unaccounted raft node(s) not found in 'nodes' table for heartbeat: #6313

Closed by turtle0x1 5 years ago

turtle0x1 commented 5 years ago

Required information

root@cluster-node1:~# lxc info
config:
  cluster.https_address: 10.233.30.54:8443
  core.https_address: 10.233.30.54:8443
  core.trust_password: true
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- devlxd_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- candid_authentication
- backup_compression
- candid_config
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- candid_config_key
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- rbac
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
environment:
  addresses:
  - 10.233.30.54:8443
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    -----END CERTIFICATE-----
  certificate_fingerprint: ae10f02e068851f7fd3f808ee6a6452822cc6a6ed11e17f4b3dcccf0d8a9bfad
  driver: lxc
  driver_version: 3.2.1
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    shiftfs: "false"
    uevent_injection: "true"
    unpriv_fscaps: "true"
  kernel_version: 5.0.0-29-generic
  lxc_features:
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    seccomp_notify: "true"
  project: default
  server: lxd
  server_clustered: true
  server_name: cluster-node1
  server_pid: 594
  server_version: "3.18"
  storage: dir
  storage_version: "1"

Issue description

When I try to create a cluster inside LXD containers, I keep running into the error below when adding the second node:

Oct 14 20:40:30 cluster-node1 lxd.daemon[473]: t=2019-10-14T20:40:30+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]"
stgraber commented 5 years ago

Does that prevent you from joining the node?

stgraber commented 5 years ago

And how exactly did you trigger the join? lxd init or direct API?

turtle0x1 commented 5 years ago

Yes, it does, and I used lxd init.

stgraber commented 5 years ago

Did you have it fail to join earlier for another reason?

stgraber commented 5 years ago

On the working node, can you show lxd sql global "SELECT * FROM nodes;" and lxc cluster list?

turtle0x1 commented 5 years ago
root@cluster-node1:~# lxd sql global "SELECT * FROM nodes;"
+----+---------------+-------------+--------------------+--------+----------------+--------------------------------+---------+
| id |     name      | description |      address       | schema | api_extensions |           heartbeat            | pending |
+----+---------------+-------------+--------------------+--------+----------------+--------------------------------+---------+
| 1  | cluster-node1 |             | 10.233.30.54:8443  | 18     | 148            | 2019-10-14T21:24:31.061676485Z | 0       |
| 2  | cluster-node2 |             | 10.233.30.169:8443 | 18     | 148            | 2019-10-14T20:34:52Z           | 1       |
+----+---------------+-------------+--------------------+--------+----------------+--------------------------------+---------+
root@cluster-node1:~# lxc cluster list
+---------------+---------------------------+----------+--------+-------------------+
|     NAME      |            URL            | DATABASE | STATE  |      MESSAGE      |
+---------------+---------------------------+----------+--------+-------------------+
| cluster-node1 | https://10.233.30.54:8443 | YES      | ONLINE | fully operational |
+---------------+---------------------------+----------+--------+-------------------+
root@cluster-node1:~# 

It may be that the client just hangs?

stgraber commented 5 years ago

Ok, so that does suggest that you had a partial failure to join which is now making a bit of a mess. Can you try lxd sql global "DELETE FROM nodes WHERE id=2;" and then try to join again, to see if you get a different error?
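
For reference, a minimal sketch of the cleanup sequence suggested above, assuming the pending entry is still id 2 (verify against your own nodes table first):

# on the working node (cluster-node1)
lxd sql global "SELECT * FROM nodes;"            # confirm the id of the pending row
lxd sql global "DELETE FROM nodes WHERE id=2;"   # drop the half-joined entry
# then retry the join from node 2
lxd init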

stgraber commented 5 years ago

We probably ought to have LXD delete any pending join when attempting to join again, but that's probably not the root of this issue.

turtle0x1 commented 5 years ago

It kind of completely messes up my host's LXD; it can't really stop containers.

stgraber commented 5 years ago

Hmm, can you provide lxd.log for both servers?

I wonder just how far it went into joining before it failed, if it got to joining the clustered database, that could be a problem indeed.

turtle0x1 commented 5 years ago

From node 1

t=2019-10-14T20:33:35+0000 lvl=info msg="LXD 3.18 is starting in normal mode" path=/var/snap/lxd/common/lxd
t=2019-10-14T20:33:35+0000 lvl=info msg="Kernel uid/gid map:" 
t=2019-10-14T20:33:35+0000 lvl=info msg=" - u 0 1000000 1000000000" 
t=2019-10-14T20:33:35+0000 lvl=info msg=" - g 0 1000000 1000000000" 
t=2019-10-14T20:33:35+0000 lvl=info msg="Configured LXD uid/gid map:" 
t=2019-10-14T20:33:35+0000 lvl=info msg=" - u 0 100000 999900000" 
t=2019-10-14T20:33:35+0000 lvl=info msg=" - g 0 100000 999900000" 
t=2019-10-14T20:33:35+0000 lvl=warn msg="Couldn't find the CGroup blkio.weight, I/O weight limits will be ignored." 
t=2019-10-14T20:33:35+0000 lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored." 
t=2019-10-14T20:33:35+0000 lvl=info msg="Kernel features:" 
t=2019-10-14T20:33:35+0000 lvl=info msg=" - netnsid-based network retrieval: yes" 
t=2019-10-14T20:33:35+0000 lvl=info msg=" - uevent injection: yes" 
t=2019-10-14T20:33:35+0000 lvl=info msg=" - seccomp listener: yes" 
t=2019-10-14T20:33:35+0000 lvl=info msg=" - unprivileged file capabilities: yes" 
t=2019-10-14T20:33:35+0000 lvl=info msg=" - shiftfs support: no" 
t=2019-10-14T20:33:35+0000 lvl=info msg="Initializing local database" 
t=2019-10-14T20:33:35+0000 lvl=info msg="Starting /dev/lxd handler:" 
t=2019-10-14T20:33:35+0000 lvl=info msg=" - binding devlxd socket" socket=/var/snap/lxd/common/lxd/devlxd/sock
t=2019-10-14T20:33:35+0000 lvl=info msg="REST API daemon:" 
t=2019-10-14T20:33:35+0000 lvl=info msg=" - binding Unix socket" inherited=true socket=/var/snap/lxd/common/lxd/unix.socket
t=2019-10-14T20:33:35+0000 lvl=info msg="Initializing global database" 
t=2019-10-14T20:33:37+0000 lvl=info msg="Initializing storage pools" 
t=2019-10-14T20:33:37+0000 lvl=info msg="Initializing networks" 
t=2019-10-14T20:33:37+0000 lvl=info msg="Pruning leftover image files" 
t=2019-10-14T20:33:37+0000 lvl=info msg="Done pruning leftover image files" 
t=2019-10-14T20:33:37+0000 lvl=info msg="Loading daemon configuration" 
t=2019-10-14T20:33:38+0000 lvl=info msg="Started seccomp handler" path=/var/snap/lxd/common/lxd/seccomp.socket
t=2019-10-14T20:33:38+0000 lvl=info msg="Pruning expired images" 
t=2019-10-14T20:33:38+0000 lvl=info msg="Done pruning expired images" 
t=2019-10-14T20:33:38+0000 lvl=info msg="Pruning expired container backups" 
t=2019-10-14T20:33:38+0000 lvl=info msg="Done pruning expired container backups" 
t=2019-10-14T20:33:38+0000 lvl=info msg="Expiring log files" 
t=2019-10-14T20:33:38+0000 lvl=info msg="Done expiring log files" 
t=2019-10-14T20:33:38+0000 lvl=info msg="Updating images" 
t=2019-10-14T20:33:38+0000 lvl=info msg="Updating instance types" 
t=2019-10-14T20:33:38+0000 lvl=info msg="Done updating images" 
t=2019-10-14T20:33:38+0000 lvl=info msg="Done updating instance types" 
t=2019-10-14T20:34:00+0000 lvl=info msg="Update network address" 
t=2019-10-14T20:34:00+0000 lvl=info msg=" - binding TCP socket" socket=10.233.30.54:8443
t=2019-10-14T20:34:00+0000 lvl=info msg="Update cluster address" 
t=2019-10-14T20:34:00+0000 lvl=info msg="Creating DIR storage pool \"local\"" 
t=2019-10-14T20:34:00+0000 lvl=info msg="Created DIR storage pool \"local\"" 
t=2019-10-14T20:34:00+0000 lvl=eror msg="Failed to get leader node address: Node is not clustered" 
t=2019-10-14T20:34:04+0000 lvl=info msg="Refreshing forkdns peers for lxdfan0" 
t=2019-10-14T20:34:57+0000 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@000b3: use of closed network connection" 
t=2019-10-14T20:35:00+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:35:10+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:35:20+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:35:30+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:35:40+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:35:50+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:36:00+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:36:10+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:36:20+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:36:30+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:36:40+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:36:50+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:37:00+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48500
t=2019-10-14T20:37:00+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48510
t=2019-10-14T20:37:00+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:37:01+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48520
t=2019-10-14T20:37:01+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48530
t=2019-10-14T20:37:01+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48540
t=2019-10-14T20:37:02+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48550
t=2019-10-14T20:37:02+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48560
t=2019-10-14T20:37:02+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48570
t=2019-10-14T20:37:03+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48580
t=2019-10-14T20:37:03+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48590
t=2019-10-14T20:37:03+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48600
t=2019-10-14T20:37:04+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48610
t=2019-10-14T20:37:04+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48620
t=2019-10-14T20:37:05+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48630
t=2019-10-14T20:37:05+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48640
t=2019-10-14T20:37:05+0000 lvl=warn msg="Rejecting request from untrusted client" ip=10.233.30.1:48650
t=2019-10-14T20:37:10+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:37:20+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:37:30+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:37:40+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:37:50+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 
t=2019-10-14T20:38:00+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.169:8443:{ID:2 Address:10.233.30.169:8443}]" 

From node 2

t=2019-10-14T20:34:11+0000 lvl=info msg="LXD 3.18 is starting in normal mode" path=/var/snap/lxd/common/lxd
t=2019-10-14T20:34:11+0000 lvl=info msg="Kernel uid/gid map:" 
t=2019-10-14T20:34:11+0000 lvl=info msg=" - u 0 1000000 1000000000" 
t=2019-10-14T20:34:11+0000 lvl=info msg=" - g 0 1000000 1000000000" 
t=2019-10-14T20:34:11+0000 lvl=info msg="Configured LXD uid/gid map:" 
t=2019-10-14T20:34:11+0000 lvl=info msg=" - u 0 100000 999900000" 
t=2019-10-14T20:34:11+0000 lvl=info msg=" - g 0 100000 999900000" 
t=2019-10-14T20:34:11+0000 lvl=warn msg="Couldn't find the CGroup blkio.weight, I/O weight limits will be ignored." 
t=2019-10-14T20:34:11+0000 lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored." 
t=2019-10-14T20:34:11+0000 lvl=info msg="Kernel features:" 
t=2019-10-14T20:34:11+0000 lvl=info msg=" - netnsid-based network retrieval: yes" 
t=2019-10-14T20:34:11+0000 lvl=info msg=" - uevent injection: yes" 
t=2019-10-14T20:34:11+0000 lvl=info msg=" - seccomp listener: yes" 
t=2019-10-14T20:34:11+0000 lvl=info msg=" - unprivileged file capabilities: yes" 
t=2019-10-14T20:34:11+0000 lvl=info msg=" - shiftfs support: no" 
t=2019-10-14T20:34:11+0000 lvl=info msg="Initializing local database" 
t=2019-10-14T20:34:11+0000 lvl=info msg="Starting /dev/lxd handler:" 
t=2019-10-14T20:34:11+0000 lvl=info msg=" - binding devlxd socket" socket=/var/snap/lxd/common/lxd/devlxd/sock
t=2019-10-14T20:34:11+0000 lvl=info msg="REST API daemon:" 
t=2019-10-14T20:34:11+0000 lvl=info msg=" - binding Unix socket" inherited=true socket=/var/snap/lxd/common/lxd/unix.socket
t=2019-10-14T20:34:11+0000 lvl=info msg="Initializing global database" 
t=2019-10-14T20:34:15+0000 lvl=info msg="Initializing storage pools" 
t=2019-10-14T20:34:15+0000 lvl=info msg="Initializing networks" 
t=2019-10-14T20:34:15+0000 lvl=info msg="Pruning leftover image files" 
t=2019-10-14T20:34:15+0000 lvl=info msg="Done pruning leftover image files" 
t=2019-10-14T20:34:15+0000 lvl=info msg="Loading daemon configuration" 
t=2019-10-14T20:34:15+0000 lvl=info msg="Started seccomp handler" path=/var/snap/lxd/common/lxd/seccomp.socket
t=2019-10-14T20:34:15+0000 lvl=info msg="Pruning expired images" 
t=2019-10-14T20:34:15+0000 lvl=info msg="Done pruning expired images" 
t=2019-10-14T20:34:15+0000 lvl=info msg="Pruning expired container backups" 
t=2019-10-14T20:34:15+0000 lvl=info msg="Done pruning expired container backups" 
t=2019-10-14T20:34:15+0000 lvl=info msg="Updating instance types" 
t=2019-10-14T20:34:15+0000 lvl=info msg="Updating images" 
t=2019-10-14T20:34:15+0000 lvl=info msg="Expiring log files" 
t=2019-10-14T20:34:15+0000 lvl=info msg="Done updating images" 
t=2019-10-14T20:34:15+0000 lvl=info msg="Done expiring log files" 
t=2019-10-14T20:34:15+0000 lvl=info msg="Done updating instance types" 
t=2019-10-14T20:34:50+0000 lvl=info msg="Update network address" 
t=2019-10-14T20:34:50+0000 lvl=info msg=" - binding TCP socket" socket=10.233.30.169:8443
t=2019-10-14T20:34:51+0000 lvl=info msg="Creating DIR storage pool \"local\"" 
t=2019-10-14T20:34:51+0000 lvl=info msg="Created DIR storage pool \"local\"" 
t=2019-10-14T20:34:52+0000 lvl=eror msg="Failed to get leader node address: Node is not clustered" 
t=2019-10-14T20:34:52+0000 lvl=info msg="Joining dqlite raft cluster" address=10.233.30.169:8443 id=2 target=10.233.30.169:8443
t=2019-10-14T20:34:57+0000 lvl=warn msg="Dqlite client proxy TLS -> Unix: read tcp 10.233.30.169:41364->10.233.30.54:8443: use of closed network connection" 
t=2019-10-14T21:34:54+0000 lvl=info msg="Asked to shutdown by API, shutting down containers" 
stgraber commented 5 years ago

Is LXD still responding properly on node1?

Node2 is likely to be pretty confused about its state right now. You mentioned running containers; I'm assuming this wasn't on node2, as new cluster nodes must be empty.

turtle0x1 commented 5 years ago
root@cluster-node1:~# lxc list
Error: cannot fetch node config from database: driver: bad connection

No, these two nodes are running in containers.

stgraber commented 5 years ago

Hmm, I'm confused, so you had containers running on node2 and then tried to have it join the cluster?

turtle0x1 commented 5 years ago

Something like this:

My PC Using lxd -> cluster-node1 -> cluster-node2

There were no containers running on either cluster-node1 or cluster-node2

3.0.3 didn't have this problem forming clusters

stgraber commented 5 years ago

Ah, okay, that makes more sense.

So the issue is really about how far things went before the join failed. The worst-case scenario is if LXD managed to replicate the database to node2, increased the database quorum to 2, and then somehow failed, leaving the node in a pending state but the DB dependent on the second node.

stgraber commented 5 years ago

What I'd really like to get is the initial join error, though, as that can't have been the Unaccounted raft ... one; that particular error is only possible on a subsequent join failure.

If it's just a test cluster with nothing on it, can you reset both containers and recreate the original failure?
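
A minimal reset from the host might look like this, a sketch assuming the container names and profile used earlier in this thread:

lxc delete --force cluster-node1 cluster-node2   # discard the half-joined state
lxc launch ubuntu: cluster-node1 -p default
lxc launch ubuntu: cluster-node2 -p default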

stgraber commented 5 years ago

I'm currently looking at some other failure cases during join to try to make things a bit harder to break, especially around mismatch in version and the like.

turtle0x1 commented 5 years ago

Sorry, I need to go now (it's late here), but I'm pretty confident I can re-create this error tomorrow.

stgraber commented 5 years ago

Thanks, it'd be great to have that error!

turtle0x1 commented 5 years ago

How to duplicate

On Host

Default profile

config:
  security.nesting: "true"
description: Default LXD profile
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk
name: default
used_by:
- /1.0/containers/node1
- /1.0/containers/node2
- /1.0/containers/node3

Container setup


lxc launch ubuntu: node1 -p default 
lxc launch ubuntu: node2 -p default 
lxc launch ubuntu: node3 -p default 

On each node

lxc exec <node> bash
snap install lxd
lxd.migrate
=> Nothing to migrate here; asked whether to uninstall the Ubuntu deb package (y/n)
=> uninstalled

lxc --version 
=> 3.18

lxd init
=> See configs below

Node 1 setup

config:
  core.https_address: 10.233.30.213:8443
  core.trust_password: clusterPassword
networks:
- config:
    bridge.mode: fan
    fan.underlay_subnet: auto
  description: ""
  managed: false
  name: lxdfan0
  type: ""
storage_pools:
- config: {}
  description: ""
  name: local
  driver: dir
profiles:
- config: {}
  description: ""
  devices:
    eth0:
      name: eth0
      nictype: bridged
      parent: lxdfan0
      type: nic
    root:
      path: /
      pool: local
      type: disk
  name: default
cluster:
  server_name: node1
  enabled: true
  member_config: []
  cluster_address: ""
  cluster_certificate: ""
  server_address: ""
  cluster_password: ""

Node 2 setup

config: {}
networks: []
storage_pools: []
profiles: []
cluster:
  server_name: node2
  enabled: true
  member_config:
  - entity: storage-pool
    name: local
    key: source
    value: ""
    description: '"source" property for storage pool "local"'
  cluster_address: 10.233.30.213:8443
  cluster_certificate: |
    -----BEGIN CERTIFICATE-----
Some cert
    -----END CERTIFICATE-----
  server_address: 10.233.30.75:8443
  cluster_password: clusterPassword
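
The same join can also be replayed non-interactively by feeding the YAML above to preseed mode; a sketch, assuming it is saved as node2.yaml (hypothetical file name):

cat node2.yaml | lxd init --preseed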

It hangs after lxd init prints the config for node 2.

The logs I think are interesting:

Oct 17 17:12:44 node1 lxd.daemon[1283]: t=2019-10-17T17:12:44+0000 lvl=warn msg="Dqlite client proxy TLS -> Unix: read tcp 10.233.30.213:33698->10.233.30.213:8443: use of closed network connection"
Oct 17 17:12:44 node1 lxd.daemon[1283]: t=2019-10-17T17:12:44+0000 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@000af: use of closed network connection"
Oct 17 17:13:37 node1 ovs-vsctl: ovs|00001|db_ctl_base|ERR|unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)
Oct 17 17:13:37 node1 ovs-vsctl: message repeated 2 times: [ ovs|00001|db_ctl_base|ERR|unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)]
Oct 17 17:13:39 node1 lxd.daemon[1283]: t=2019-10-17T17:13:39+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.75:8443:{ID:2 Address:10.233.30.75:8443}]"
Oct 17 17:13:43 node1 lxd.daemon[1283]: t=2019-10-17T17:13:43+0000 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@000af: use of closed network connection"
Oct 17 17:13:49 node1 lxd.daemon[1283]: t=2019-10-17T17:13:49+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.75:8443:{ID:2 Address:10.233.30.75:8443}]"
Oct 17 17:13:59 node1 lxd.daemon[1283]: t=2019-10-17T17:13:59+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.233.30.75:8443:{ID:2 Address:10.233.30.75:8443}]"

The whole log up until the entries above:

Oct 17 17:12:13 node1 systemd[1]: system.slice: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: cron.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: console-setup.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: systemd-resolved.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: Failed to set devices.allow on /system.slice/systemd-resolved.service: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: message repeated 7 times: [ Failed to set devices.allow on /system.slice/systemd-resolved.service: Operation not permitted]
Oct 17 17:12:13 node1 systemd[1]: ebtables.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: cloud-final.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: systemd-user-sessions.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: systemd-sysctl.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: unattended-upgrades.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: systemd-networkd.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: systemd-udev-trigger.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: snapd.socket: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: polkit.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: apport.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: networkd-dispatcher.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: systemd-journald.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: systemd-tmpfiles-setup.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: blk-availability.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: sys-fs-fuse-connections.mount: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: snapd.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: console-getty.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: systemd-update-utmp.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: systemd-tmpfiles-setup-dev.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: snap-core-7917.mount: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: ssh.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: rsyslog.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: systemd-udevd.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: apparmor.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: systemd-networkd-wait-online.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: snap.lxd.daemon.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: ufw.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: accounts-daemon.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: atd.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: systemd-journal-flush.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: cloud-init-local.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: snap.mount: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: cloud-init.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: snap-lxd-12211.mount: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: cloud-config.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: user.slice: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: dbus.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: keyboard-setup.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: system-getty.slice: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: snapd.seeded.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:13 node1 systemd[1]: systemd-logind.service: Failed to reset devices.list: Operation not permitted
Oct 17 17:12:38 node1 networkd-dispatcher[209]: WARNING:Unknown index 2 seen, reloading interface list
Oct 17 17:12:38 node1 systemd-udevd[2461]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct 17 17:12:38 node1 systemd-udevd[2461]: Could not generate persistent MAC address for lxdfan0: No such file or directory
Oct 17 17:12:38 node1 networkd-dispatcher[209]: WARNING:Unknown index 3 seen, reloading interface list
Oct 17 17:12:38 node1 systemd-udevd[2464]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct 17 17:12:38 node1 systemd-udevd[2464]: Could not generate persistent MAC address for lxdfan0-mtu: No such file or directory
Oct 17 17:12:38 node1 systemd-networkd[180]: lxdfan0-mtu: Gained carrier
Oct 17 17:12:38 node1 systemd-networkd[180]: lxdfan0-mtu: Gained IPv6LL
Oct 17 17:12:38 node1 networkd-dispatcher[209]: WARNING:Unknown index 4 seen, reloading interface list
Oct 17 17:12:38 node1 systemd-udevd[2503]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct 17 17:12:38 node1 systemd-udevd[2503]: Could not generate persistent MAC address for lxdfan0-fan: No such file or directory
Oct 17 17:12:38 node1 systemd-networkd[180]: lxdfan0-fan: Gained carrier
Oct 17 17:12:39 node1 dnsmasq[2515]: started, version 2.75 cachesize 150
Oct 17 17:12:39 node1 dnsmasq[2515]: compile time options: IPv6 GNU-getopt DBus i18n IDN DHCP DHCPv6 no-Lua TFTP conntrack ipset auth DNSSEC loop-detect inotify
Oct 17 17:12:39 node1 dnsmasq[2515]: LOUD WARNING: listening on 240.213.0.1 may accept requests via interfaces other than lxdfan0
Oct 17 17:12:39 node1 dnsmasq[2515]: LOUD WARNING: use --bind-dynamic rather than --bind-interfaces to avoid DNS amplification attacks via these interface(s)
Oct 17 17:12:39 node1 dnsmasq-dhcp[2515]: DHCP, IP range 240.213.0.2 -- 240.213.0.254, lease time 1h
Oct 17 17:12:39 node1 dnsmasq-dhcp[2515]: DHCP, sockets bound exclusively to interface lxdfan0
Oct 17 17:12:39 node1 dnsmasq[2515]: using nameserver 240.213.0.1#1053 for domain 240.in-addr.arpa
Oct 17 17:12:39 node1 dnsmasq[2515]: using nameserver 240.213.0.1#1053 for domain lxd
Oct 17 17:12:39 node1 dnsmasq[2515]: reading /etc/resolv.conf
Oct 17 17:12:39 node1 dnsmasq[2515]: using nameserver 240.213.0.1#1053 for domain 240.in-addr.arpa
Oct 17 17:12:39 node1 dnsmasq[2515]: using nameserver 240.213.0.1#1053 for domain lxd
Oct 17 17:12:39 node1 dnsmasq[2515]: using nameserver 127.0.0.53#53
Oct 17 17:12:39 node1 dnsmasq[2515]: read /etc/hosts - 7 addresses
Oct 17 17:12:39 node1 dnsmasq[2515]: read /etc/hosts - 7 addresses
Oct 17 17:12:39 node1 lxd-forkdns[2520]: t=2019-10-17T17:12:39+0000 lvl=info msg=Started
Oct 17 17:12:39 node1 lxd-forkdns[2520]: t=2019-10-17T17:12:39+0000 lvl=info msg="Server list loaded: []"
Oct 17 17:12:39 node1 lxd.daemon[1283]: t=2019-10-17T17:12:39+0000 lvl=eror msg="Failed to get leader node address: Node is not clustered"
Oct 17 17:12:39 node1 systemd-networkd[180]: lxdfan0: Gained carrier
Oct 17 17:12:39 node1 dnsmasq[2515]: reading /etc/resolv.conf
Oct 17 17:12:39 node1 dnsmasq[2515]: using nameserver 240.213.0.1#1053 for domain 240.in-addr.arpa
Oct 17 17:12:39 node1 dnsmasq[2515]: using nameserver 240.213.0.1#1053 for domain lxd
Oct 17 17:12:39 node1 dnsmasq[2515]: using nameserver 127.0.0.53#53
Oct 17 17:12:40 node1 systemd-networkd[180]: lxdfan0-fan: Gained IPv6LL
Oct 17 17:12:40 node1 dnsmasq[2515]: reading /etc/resolv.conf
Oct 17 17:12:40 node1 dnsmasq[2515]: using nameserver 240.213.0.1#1053 for domain 240.in-addr.arpa
Oct 17 17:12:40 node1 dnsmasq[2515]: using nameserver 240.213.0.1#1053 for domain lxd
Oct 17 17:12:40 node1 dnsmasq[2515]: using nameserver 127.0.0.53#53
Oct 17 17:12:41 node1 systemd-networkd[180]: lxdfan0: Gained IPv6LL

Node 1

root@node1:~# lxd sql global "select * from Nodes;";
+----+-------+-------------+--------------------+--------+----------------+--------------------------------+---------+
| id | name  | description |      address       | schema | api_extensions |           heartbeat            | pending |
+----+-------+-------------+--------------------+--------+----------------+--------------------------------+---------+
| 1  | node1 |             | 10.233.30.213:8443 | 18     | 148            | 2019-10-17T17:31:39.278363379Z | 0       |
| 2  | node2 |             | 10.233.30.75:8443  | 18     | 148            | 2019-10-17T17:13:37Z           | 1       |
+----+-------+-------------+--------------------+--------+----------------+--------------------------------+---------+
stgraber commented 5 years ago

Thanks. I'm on vacation this week but will take another look at this early next week.

turtle0x1 commented 5 years ago

Enjoy 😊

stgraber commented 5 years ago

@turtle0x1 can you easily test with the edge build (snap install lxd --edge)?

I've done some initial testing and edge doesn't appear affected, that or I just got lucky.
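
If LXD is already installed from a stable channel, switching the existing snap over should be enough to test; a sketch using the standard snap commands:

snap refresh lxd --channel=edge   # switch the installed snap to the edge channel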

turtle0x1 commented 5 years ago

Yeah, it will be a couple of days at this point though!

stgraber commented 5 years ago

What's odd is that we have daily tests for clustering which pretty much do the exact same thing you are doing, and I don't believe I've seen such hangs in those tests, so I wonder what the actual condition is that causes this issue.

turtle0x1 commented 5 years ago

It could be lots of things; messing around with clustering required a lot of purging LXD and reinstalling.

stgraber commented 5 years ago

@turtle0x1 did you have a chance to poke at this some more?

turtle0x1 commented 5 years ago

No, sorry, I moved on; clustering is way too annoying! We can close this if you want.

forest-official commented 5 years ago

I'm running into the exact same problem while installing a new LXD cluster with the same Ansible scripts I've been using for a long time now.

Having this problem when installing from 3.17/stable & 3.18/stable

Trying edge now. Will report back.

stgraber commented 5 years ago

Interesting. I've seen this hit once but haven't been able to reproduce it on edge, and it's never hit our daily cluster testing, which automatically assembles an 8-node cluster using something very similar to what @turtle0x1 described. If we can get some kind of reliable reproducer to use on our end, we can track this down.

The message Unaccounted raft nodes... isn't really an issue by itself; it just indicates that a raft member was added that isn't yet listed in the global database. This is effectively just a race between the heartbeat-sending function and the join/promotion logic. The message is logged so we can track down inconsistencies longer term, but it isn't itself an error and so shouldn't cause the hangs described here.
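
One way to see the inconsistency the message refers to is to compare raft membership against the cluster membership table, using the same queries shown elsewhere in this thread:

lxd sql local "SELECT * FROM raft_nodes;"                        # raft membership (local database)
lxd sql global "SELECT id, name, address, pending FROM nodes;"   # members known to the heartbeat
# Any raft address missing from 'nodes' (or stuck at pending=1) is what
# the heartbeat logs as "unaccounted".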

forest-official commented 5 years ago

Alright, edge is reporting the same problems. Both node2 and node3 are joining node1's cluster via lxd init --preseed.

Node1:

t=2019-10-29T14:34:57+0000 lvl=info msg="Creating ZFS storage pool \"zfs\"" 
t=2019-10-29T14:35:00+0000 lvl=info msg="Created ZFS storage pool \"zfs\"" 
t=2019-10-29T14:35:01+0000 lvl=info msg="Creating CEPH OSD storage pool \"ceph\" in cluster \"ceph\"" 
t=2019-10-29T14:35:06+0000 lvl=info msg="Created CEPH OSD storage pool \"ceph\" in cluster \"ceph\"" 
t=2019-10-29T14:35:06+0000 lvl=eror msg="Failed to get leader node address: Node is not clustered" 
t=2019-10-29T14:35:12+0000 lvl=warn msg="Dqlite client proxy TLS -> Unix: read tcp 10.254.0.4:50258->10.254.0.4:8443: use of closed network connection" 
t=2019-10-29T14:35:12+0000 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@0001e: use of closed network connection" 
t=2019-10-29T14:35:56+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.254.0.3:8443:{ID:2 Address:10.254.0.3:8443}]" 
t=2019-10-29T14:35:57+0000 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@0001e: use of closed network connection" 
t=2019-10-29T14:35:57+0000 lvl=eror msg="Empty raft node set received" 
t=2019-10-29T14:36:01+0000 lvl=warn msg="Dqlite server proxy Unix -> TLS: read unix @->@0001e: use of closed network connection" 
t=2019-10-29T14:36:06+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.254.0.3:8443:{ID:2 Address:10.254.0.3:8443}]" 
t=2019-10-29T14:36:16+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.254.0.3:8443:{ID:2 Address:10.254.0.3:8443}]" 
t=2019-10-29T14:36:26+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.254.0.3:8443:{ID:2 Address:10.254.0.3:8443}]" 
t=2019-10-29T14:36:36+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.254.0.3:8443:{ID:2 Address:10.254.0.3:8443}]" 
t=2019-10-29T14:36:46+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.254.0.3:8443:{ID:2 Address:10.254.0.3:8443}]" 
t=2019-10-29T14:36:56+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.254.0.3:8443:{ID:2 Address:10.254.0.3:8443}]" 
t=2019-10-29T14:37:06+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.254.0.3:8443:{ID:2 Address:10.254.0.3:8443}]" 

node2:

t=2019-10-29T14:35:48+0000 lvl=info msg="Creating ZFS storage pool \"zfs\"" 
t=2019-10-29T14:35:55+0000 lvl=info msg="Created ZFS storage pool \"zfs\"" 
t=2019-10-29T14:35:55+0000 lvl=eror msg="Failed to get leader node address: Node is not clustered" 
t=2019-10-29T14:35:56+0000 lvl=info msg="Joining dqlite raft cluster" address=10.254.0.3:8443 id=2 target=10.254.0.3:8443
t=2019-10-29T14:35:57+0000 lvl=eror msg="Empty raft node set received" 
t=2019-10-29T14:36:01+0000 lvl=warn msg="Dqlite client proxy TLS -> Unix: read tcp 10.254.0.3:55172->10.254.0.4:8443: use of closed network connection" 

node3:

t=2019-10-29T14:35:48+0000 lvl=info msg="Creating ZFS storage pool \"zfs\"" 
t=2019-10-29T14:35:56+0000 lvl=info msg="Created ZFS storage pool \"zfs\"" 
t=2019-10-29T14:35:56+0000 lvl=eror msg="Failed to get leader node address: Node is not clustered" 
t=2019-10-29T14:35:56+0000 lvl=info msg="Joining dqlite raft cluster" address=10.254.0.5:8443 id=3 target=10.254.0.5:8443
t=2019-10-29T14:35:56+0000 lvl=info msg="Migrate local data to cluster database" 
t=2019-10-29T14:35:57+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.254.0.3:8443:{ID:2 Address:10.254.0.3:8443} 10.254.0.4:8443:{ID:1 Address:10.254.0.4:8443} 10.254.0.5:8443:{ID:3 Address:10.254.0.5:8443}]" 
t=2019-10-29T14:35:57+0000 lvl=warn msg="Dqlite client proxy TLS -> Unix: read tcp 10.254.0.5:58170->10.254.0.4:8443: use of closed network connection" 
t=2019-10-29T14:35:57+0000 lvl=eror msg="Failed to retrieve the information of leader member: No such object" 

Node1 global nodes list:

+----+------------+-------------+-----------------+--------+----------------+--------------------------------+---------+
| id |    name    | description |     address     | schema | api_extensions |           heartbeat            | pending |
+----+------------+-------------+-----------------+--------+----------------+--------------------------------+---------+
| 1  | alpha-0001 |             | 10.254.0.4:8443 | 18     | 152            | 2019-10-29T14:40:16.928006527Z | 0       |
| 2  | alpha-0002 |             | 10.254.0.3:8443 | 18     | 152            | 2019-10-29T14:35:55Z           | 1       |
| 3  | alpha-0003 |             | 10.254.0.5:8443 | 18     | 152            | 2019-10-29T14:40:16.928437736Z | 0       |
+----+------------+-------------+-----------------+--------+----------------+--------------------------------+---------+

At this moment I'm unable to set up a working LXD cluster, while I've been able to set up a working cluster in the past with the exact same Ansible scripts. Please let me know if I can supply more information.

stgraber commented 5 years ago

@suprasoil can you provide the answers you used for lxd init on node1 and the preseed you used for the other two?

Also are those physical systems or are they LXD containers themselves?

stgraber commented 5 years ago

@suprasoil what kind of storage speed are you dealing with? I believe the only time I managed to reproduce this was on a reasonably slow system. Everything else I'm playing with has stupidly fast NVMe SSDs (including our CI runs) and so didn't seem to hit it.

forest-official commented 5 years ago

node1 config:

config:
  core.https_address: 10.254.0.4:8443
  core.trust_password: secret
cluster:
  server_name: alpha-0001
  enabled: true
  member_config: []
  cluster_address: ""
  cluster_certificate: ""
  server_address: ""
  cluster_password: ""
networks: []
storage_pools:
- config:
    source: forest/lxd
  description: ""
  name: zfs
  driver: zfs
- config:
    ceph.cluster_name: ceph
    ceph.osd.pg_num: "128"
    ceph.osd.pool_name: lxd
  description: ""
  name: ceph
  driver: ceph
profiles:
- config:
    boot.autostart: "1"
    raw.idmap: both 1337 0
  description: clients LXD profile
  devices:
    eth0:
      nictype: bridged
      parent: clients
      type: nic
    root:
      path: /
      pool: zfs
      type: disk
  name: default

Node 2 & 3 config (although with a different address & name):

cluster:
  server_name: alpha-0002
  enabled: true
  cluster_address: 10.254.0.4:8443
  cluster_certificate: |
    -----BEGIN CERTIFICATE-----
    removed
    -----END CERTIFICATE-----

  server_address: 10.254.0.3:8443
  cluster_password: secret
storage_pools:
- config:
    source: forest/lxd
  description: ""
  name: zfs
  driver: zfs

Hosts are KVM/qemu virtual machines running from a SATA600 SSD.

forest-official commented 5 years ago

Would you have any tips on how to work around this problem? I'm currently unable to set up a cluster.

I've tried setting it up automatically and manually; both result in this problem.

forest-official commented 5 years ago

Tested setting up the cluster manually on 3 fresh Ubuntu 18.04 hosts with the latest LXD snap, which seems to be working fine; will investigate further.

turtle0x1 commented 5 years ago

They are running 3.0.3 though, right? Because that works fine in my testing.

forest-official commented 5 years ago

Nope, they're running the latest snap; edited the comment.

forest-official commented 5 years ago

Every time I try this with my 3 test nodes, the first node initializes just fine and one of the other nodes connects just fine to the first node. It's always the last node that won't connect, which causes the perpetual "unaccounted raft node(s) not found" errors.

forest-official commented 5 years ago

I should mention the nodes are running off a ZFS root filesystem; could that have something to do with this? From what I could find, it seems SQLite can be quite a lot slower on ZFS.
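
A crude way to gauge sync-write latency on the filesystem backing LXD's database (a rough probe, not an LXD diagnostic; the path is the snap data directory used elsewhere in this thread):

dd if=/dev/zero of=/var/snap/lxd/common/lxd/synctest bs=4k count=500 oflag=dsync
rm /var/snap/lxd/common/lxd/synctest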

forest-official commented 5 years ago

I've been setting up LXD clusters on the same ZFS setup for over a year now, though, and haven't encountered this before.

freeekanayaka commented 5 years ago

@suprasoil the YAML preseed that you pasted for node 2/3 seems wrong:

cluster:
  server_name: alpha-0002
  enabled: true
  cluster_address: 10.254.0.4:8443
  cluster_certificate: |
    -----BEGIN CERTIFICATE-----
    removed
    -----END CERTIFICATE-----

  server_address: 10.254.0.3:8443
  cluster_password: secret
storage_pools:
- config:
    source: forest/lxd
  description: ""
  name: zfs
  driver: zfs

You should only have a cluster section there, and provide storage-related configuration options in member_config. The YAML you are using might have worked with 3.0.3, but the format has changed since then.

Please see https://lxd.readthedocs.io/en/latest/clustering/#preseed
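
Applied to the preseed above, that would look something like this (a sketch; certificate and password elided as before):

cluster:
  server_name: alpha-0002
  enabled: true
  cluster_address: 10.254.0.4:8443
  cluster_certificate: |
    -----BEGIN CERTIFICATE-----
    removed
    -----END CERTIFICATE-----
  server_address: 10.254.0.3:8443
  cluster_password: secret
  member_config:
  - entity: storage-pool
    name: zfs
    key: source
    value: forest/lxd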

forest-official commented 5 years ago

After looking through the clustering logic and seeing that we're looking up nodes from the raft_nodes table, I discovered that none of my nodes have a raft_nodes table.

root@alpha-0001:/# lxd sql global "select name from sqlite_master where type='table';"
+------------------------------------+
|                name                |
+------------------------------------+
| schema                             |
| sqlite_sequence                    |
| certificates                       |
| config                             |
| images                             |
| images_aliases                     |
| images_nodes                       |
| images_properties                  |
| images_source                      |
| instances                          |
| instances_backups                  |
| instances_config                   |
| instances_devices                  |
| instances_devices_config           |
| instances_profiles                 |
| instances_snapshots                |
| instances_snapshots_config         |
| instances_snapshots_devices        |
| instances_snapshots_devices_config |
| networks                           |
| networks_config                    |
| networks_nodes                     |
| nodes                              |
| operations                         |
| profiles                           |
| profiles_config                    |
| profiles_devices                   |
| profiles_devices_config            |
| projects                           |
| projects_config                    |
| storage_pools                      |
| storage_pools_config               |
| storage_pools_nodes                |
| storage_volumes                    |
| storage_volumes_config             |
+------------------------------------+
forest-official commented 5 years ago

@freeekanayaka I will look into that and report back.

freeekanayaka commented 5 years ago

Probably you need something like:

  member_config:
  - entity: storage-pool
    name: zfs
    key: source
    value: forest/lxd
freeekanayaka commented 5 years ago

Also, FYI the raft_nodes table is in the local database:

lxd sql local "SELECT * FROM raft_nodes"
forest-official commented 5 years ago

Also, FYI the raft_nodes table is in the local database:

Doh, could have guessed that.

the YAML preseed that you pasted for node 2/3 seems wrong:

I've tried setting it up manually using the lxd init command as well; it also results in this problem.

Now using this config:

cluster:
  server_name: alpha-0002
  enabled: true
  cluster_address: 10.254.0.4:8443
  cluster_certificate: |
    -----BEGIN CERTIFICATE-----
    -----END CERTIFICATE-----

  server_address: 10.254.0.3:8443
  cluster_password: pass
  member_config:
  - entity: storage-pool
    name: zfs
    driver: zfs
    key: source
    value: "forest/lxd"

First node is again logging these errors:

t=2019-11-06T13:36:07+0000 lvl=eror msg="Unaccounted raft node(s) not found in 'nodes' table for heartbeat: map[10.254.0.5:8443:{ID:2 Address:10.254.0.5:8443}]"

Raft nodes table on first node:

root@alpha-0001:/# lxd sql local "select * from raft_nodes;"
+----+-----------------+
| id |     address     |
+----+-----------------+
| 1  | 10.254.0.4:8443 |
| 2  | 10.254.0.5:8443 |
+----+-----------------+

Nodes table on first node:

root@alpha-0001:/# lxd sql global "select * from nodes;"
+----+------------+-------------+-----------------+--------+----------------+--------------------------------+---------+
| id |    name    | description |     address     | schema | api_extensions |           heartbeat            | pending |
+----+------------+-------------+-----------------+--------+----------------+--------------------------------+---------+
| 1  | alpha-0001 |             | 10.254.0.4:8443 | 17     | 143            | 2019-11-06T13:37:09.434069101Z | 0       |
| 2  | alpha-0003 |             | 10.254.0.5:8443 | 17     | 143            | 2019-11-06T13:33:53Z           | 1       |
| 3  | alpha-0002 |             | 10.254.0.3:8443 | 17     | 143            | 2019-11-06T13:37:09.43430051Z  | 0       |
+----+------------+-------------+-----------------+--------+----------------+--------------------------------+---------+

Also, node 2 is marked as part of the cluster, but its address is not in the cluster member list:

root@alpha-0001:/# lxc cluster list
+------------+-------------------------+----------+--------+-------------------+
|    NAME    |           URL           | DATABASE | STATE  |      MESSAGE      |
+------------+-------------------------+----------+--------+-------------------+
| alpha-0001 | https://10.254.0.4:8443 | YES      | ONLINE | fully operational |
+------------+-------------------------+----------+--------+-------------------+
| alpha-0002 | https://10.254.0.3:8443 | NO       | ONLINE | fully operational |
+------------+-------------------------+----------+--------+-------------------+
freeekanayaka commented 5 years ago

If you notice, the pending column for alpha-0003 in the nodes table is set to 1, suggesting that it did not successfully join. That's the part you need to check: what error did it throw (if any)? What do you see in the logs?
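
For reference, the daemon log excerpted earlier in this thread lives under the snap's data directory; a sketch of where to look on the failed joiner:

cat /var/snap/lxd/common/lxd/logs/lxd.log   # daemon log for the snap package
snap logs lxd -f                            # or follow the daemon output live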

forest-official commented 5 years ago

Same result as this: https://github.com/lxc/lxd/issues/6313#issuecomment-547452963

t=2019-11-06T14:25:31+0000 lvl=eror msg="Failed to get leader node address: Node is not clustered" 
t=2019-11-06T14:25:31+0000 lvl=info msg="Joining dqlite raft cluster" address=10.254.0.5:8443 id=2 target=10.254.0.5:8443
t=2019-11-06T14:25:35+0000 lvl=eror msg="Empty raft node set received" 
t=2019-11-06T14:25:36+0000 lvl=warn msg="Dqlite client proxy TLS -> Unix: read tcp 10.254.0.5:45070->10.254.0.4:8443: use of closed network connection"

I've been trying this from snapshots a lot of times now, always getting these errors. One try ago, though, setting up the cluster suddenly worked, so this problem seems performance-related (ZFS?).