LINBIT / linstor-server

High Performance Software-Defined Block Storage for container, cloud and virtualisation. Fully integrated with Docker, Kubernetes, Openstack, Proxmox etc.
https://docs.linbit.com/docs/linstor-guide/
GNU General Public License v3.0

Sometimes nodes don't become Offline when satellites are shut down #331

Open kvaps opened 1 year ago

kvaps commented 1 year ago

This issue is not related to #330, but it occurs on the same cluster.

When I started testing this, I ran into a situation where linstor-satellite was shut down, but linstor-controller continued to report the node as Online.

I sometimes see this issue in other clusters too. Restarting linstor-controller usually makes it work again:

# linstor n l
Defaulted container "linstor-controller" out of: linstor-controller, kube-rbac-proxy
╭───────────────────────────────────────────────────────────────────────────────────────╮
┊ Node                                ┊ NodeType   ┊ Addresses                ┊ State   ┊
╞═══════════════════════════════════════════════════════════════════════════════════════╡
┊ hf-virt-01                          ┊ SATELLITE  ┊ 95.217.77.109:3367 (SSL) ┊ Online  ┊
┊ hf-virt-02                          ┊ SATELLITE  ┊ 95.217.77.33:3367 (SSL)  ┊ Online  ┊
┊ hf-virt-03                          ┊ SATELLITE  ┊ 95.217.77.30:3367 (SSL)  ┊ Online  ┊
┊ linstor-controller-6d6cd57c58-d2h9f ┊ CONTROLLER ┊ 10.111.2.1:3367 (SSL)    ┊ OFFLINE ┊
┊ linstor-controller-6d6cd57c58-m5rb6 ┊ CONTROLLER ┊ 10.111.1.166:3367 (SSL)  ┊ Online  ┊
╰───────────────────────────────────────────────────────────────────────────────────────╯
# linstor sp l
Defaulted container "linstor-controller" out of: linstor-controller, kube-rbac-proxy
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool          ┊ Node       ┊ Driver   ┊ PoolName     ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State   ┊ SharedName ┊
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ hf-virt-01 ┊ DISKLESS ┊              ┊              ┊               ┊ False        ┊ Warning ┊            ┊
┊ DfltDisklessStorPool ┊ hf-virt-02 ┊ DISKLESS ┊              ┊              ┊               ┊ False        ┊ Warning ┊            ┊
┊ DfltDisklessStorPool ┊ hf-virt-03 ┊ DISKLESS ┊              ┊              ┊               ┊ False        ┊ Warning ┊            ┊
┊ thindata             ┊ hf-virt-01 ┊ LVM_THIN ┊ vg0/thindata ┊              ┊               ┊ True         ┊ Warning ┊            ┊
┊ thindata             ┊ hf-virt-02 ┊ LVM_THIN ┊ vg0/thindata ┊              ┊               ┊ True         ┊ Warning ┊            ┊
┊ thindata             ┊ hf-virt-03 ┊ LVM_THIN ┊ vg0/thindata ┊              ┊               ┊ True         ┊ Warning ┊            ┊
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
WARNING:
Description:
    No active connection to satellite 'hf-virt-01'
Details:
    The controller is trying to (re-) establish a connection to the satellite. The controller stored the changes and as soon the satellite is connected, it will receive this update.
WARNING:
Description:
    No active connection to satellite 'hf-virt-02'
Details:
    The controller is trying to (re-) establish a connection to the satellite. The controller stored the changes and as soon the satellite is connected, it will receive this update.
WARNING:
Description:
    No active connection to satellite 'hf-virt-03'
Details:
    The controller is trying to (re-) establish a connection to the satellite. The controller stored the changes and as soon the satellite is connected, it will receive this update.
# linstor n l
Defaulted container "linstor-controller" out of: linstor-controller, kube-rbac-proxy
╭───────────────────────────────────────────────────────────────────────────────────────╮
┊ Node                                ┊ NodeType   ┊ Addresses                ┊ State   ┊
╞═══════════════════════════════════════════════════════════════════════════════════════╡
┊ hf-virt-01                          ┊ SATELLITE  ┊ 95.217.77.109:3367 (SSL) ┊ Online  ┊
┊ hf-virt-02                          ┊ SATELLITE  ┊ 95.217.77.33:3367 (SSL)  ┊ Online  ┊
┊ hf-virt-03                          ┊ SATELLITE  ┊ 95.217.77.30:3367 (SSL)  ┊ Online  ┊
┊ linstor-controller-6d6cd57c58-d2h9f ┊ CONTROLLER ┊ 10.111.2.1:3367 (SSL)    ┊ OFFLINE ┊
┊ linstor-controller-6d6cd57c58-m5rb6 ┊ CONTROLLER ┊ 10.111.1.166:3367 (SSL)  ┊ Online  ┊
╰───────────────────────────────────────────────────────────────────────────────────────╯
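
One way to confirm independently of the controller that a satellite is really down is a plain TCP probe of the satellite port. The following is only a minimal sketch: the function name `satellite_port_open` and the 2-second timeout are illustrative assumptions, and the default port 3367 is taken from the Addresses column above. It checks TCP reachability only and does not perform the SSL handshake.

```python
import socket


def satellite_port_open(host: str, port: int = 3367, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: it only tests that something accepts connections
    on the port, not that a healthy linstor-satellite is answering.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

If this returns False for a node that `linstor n l` still lists as Online, the controller's view is stale.
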
kvaps commented 1 year ago

Why can this happen? I see the same behavior in different clusters:

# linstor n l
+---------------------------------------------------------------------------------------+
| Node                                | NodeType   | Addresses                | State   |
|=======================================================================================|
| hf-virt-01                          | SATELLITE  | 192.168.100.1:3367 (SSL) | Online  |
| hf-virt-02                          | SATELLITE  | 192.168.100.2:3367 (SSL) | Online  |
| hf-virt-03                          | SATELLITE  | 192.168.100.3:3367 (SSL) | Online  |
| linstor-controller-75b5cf65b4-4m876 | CONTROLLER | 10.111.2.172:3367 (SSL)  | OFFLINE |
| linstor-controller-75b5cf65b4-6p6rx | CONTROLLER | 10.111.0.122:3367 (SSL)  | Online  |
+---------------------------------------------------------------------------------------+
# linstor sp l
+----------------------------------------------------------------------------------------------------------------------------------+
| StoragePool          | Node       | Driver   | PoolName     | FreeCapacity | TotalCapacity | CanSnapshots | State   | SharedName |
|==================================================================================================================================|
| DfltDisklessStorPool | hf-virt-01 | DISKLESS |              |              |               | False        | Warning |            |
| DfltDisklessStorPool | hf-virt-02 | DISKLESS |              |              |               | False        | Warning |            |
| DfltDisklessStorPool | hf-virt-03 | DISKLESS |              |              |               | False        | Warning |            |
| thindata             | hf-virt-01 | LVM_THIN | vg0/thindata |              |               | True         | Warning |            |
| thindata             | hf-virt-02 | LVM_THIN | vg0/thindata |              |               | True         | Warning |            |
| thindata             | hf-virt-03 | LVM_THIN | vg0/thindata |              |               | True         | Warning |            |
+----------------------------------------------------------------------------------------------------------------------------------+
WARNING:
Description:
    No active connection to satellite 'hf-virt-01'
Details:
    The controller is trying to (re-) establish a connection to the satellite. The controller stored the changes and as soon the satellite is connected, it will receive this update.
WARNING:
Description:
    No active connection to satellite 'hf-virt-02'
Details:
    The controller is trying to (re-) establish a connection to the satellite. The controller stored the changes and as soon the satellite is connected, it will receive this update.
WARNING:
Description:
    No active connection to satellite 'hf-virt-03'
Details:
    The controller is trying to (re-) establish a connection to the satellite. The controller stored the changes and as soon the satellite is connected, it will receive this update.
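
The inconsistency shown above (nodes listed as Online while the storage pool list warns about a missing connection) could be detected automatically by comparing the two outputs. This is a rough sketch under stated assumptions: it parses the human-readable tables exactly as printed in this issue, and the function names are hypothetical rather than part of LINSTOR.

```python
import re

# Column separators used by the two table styles shown in this issue.
_SEPARATORS = "┊|"


def node_states(node_list_output: str) -> dict:
    """Map node name -> State, parsed from `linstor n l` table text."""
    states = {}
    for line in node_list_output.splitlines():
        line = line.strip()
        if not line or line[0] not in _SEPARATORS:
            continue  # skip borders, prompts, and kubectl notices
        cells = [c.strip() for c in re.split(r"[┊|]", line)[1:-1]]
        if len(cells) != 4 or cells[0] == "Node":
            continue  # separator row, header row, or not a node row
        name, _node_type, _addresses, state = cells
        states[name] = state
    return states


def disconnected_satellites(sp_list_output: str) -> set:
    """Satellite names mentioned in 'No active connection' warnings."""
    pattern = r"No active connection to satellite '([^']+)'"
    return set(re.findall(pattern, sp_list_output))


def stale_online_nodes(node_list_output: str, sp_list_output: str) -> list:
    """Nodes reported Online although a warning says they are disconnected."""
    states = node_states(node_list_output)
    return sorted(
        name
        for name in disconnected_satellites(sp_list_output)
        if states.get(name) == "Online"
    )
```

Feeding it the captured text of `linstor n l` and `linstor sp l` returns the names of nodes whose Online state contradicts the warnings; an empty list means the two views agree.
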
kvaps commented 12 months ago

Related to https://github.com/LINBIT/linstor-server/issues/219

Every time we see this, there is a "Target decrypted buffer is too small!" error in the log.