LINBIT / linstor-server

High Performance Software-Defined Block Storage for container, cloud and virtualisation. Fully integrated with Docker, Kubernetes, Openstack, Proxmox etc.
https://docs.linbit.com/docs/linstor-guide/
GNU General Public License v3.0

Metrics scraping blocking HTTP health check #172

Open kvaps opened 3 years ago

kvaps commented 3 years ago

Hi, when I enable /metrics scraping, /health stops working on the plain listener, e.g.:

# curl 'http://localhost:3370/health'
<stuck>
# curl 'http://localhost:3370/metrics'
<stuck>
# curl --cacert /tls/ca.crt --cert /tls/tls.crt --key /tls/tls.key 'https://localhost:3371/health';
<ok>

and after a while:

# curl --cacert /tls/ca.crt --cert /tls/tls.crt --key /tls/tls.key 'https://localhost:3371/health'; echo
Services not running: NetComService
# curl 'http://localhost:3370/health'; echo
Services not running: NetComService

but linstor is still working:

# linstor c v
linstor controller 1.8.0; GIT-hash: e56b6c2a80b6d000921a998e3ba4cd1102fbdd39
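
A hang like the one shown in the transcripts above can be told apart from an unhealthy-but-responding endpoint by bounding each probe with a timeout. The following is only a sketch: `classify_probe` and `probe_health` are hypothetical helper names, the 5-second limit is an arbitrary choice, and treating any other successful response as "ok" is an assumption. The curl exit codes used (28 for a timeout, 7 for a failed connection) are standard curl behaviour.

```shell
# Hypothetical helper: map a curl exit code and response body to a probe state.
classify_probe() {
  case "$1" in
    0)
      case "$2" in
        # Body format taken from the output quoted in this issue.
        "Services not running:"*) echo "unhealthy" ;;
        # Assumption: any other successful response counts as healthy.
        *) echo "ok" ;;
      esac ;;
    28) echo "stuck" ;;      # curl exit code 28: operation timed out
    7)  echo "refused" ;;    # curl exit code 7: failed to connect
    *)  echo "error($1)" ;;  # any other curl failure
  esac
}

# Hypothetical helper: probe one URL with a hard time limit.
# Extra arguments (e.g. TLS options) are passed through to curl.
probe_health() {
  probe_url=$1; shift
  probe_body=$(curl --silent --max-time 5 "$@" "$probe_url")
  probe_rc=$?
  printf '%s %s\n' "$probe_url" "$(classify_probe "$probe_rc" "$probe_body")"
}
```

For example, `probe_health 'http://localhost:3370/health'` checks the plain listener, and `probe_health 'https://localhost:3371/health' --cacert /tls/ca.crt --cert /tls/tls.crt --key /tls/tls.key` checks the TLS one, so both can be compared in one run without a stuck request blocking the shell.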

rp- commented 3 years ago

So the first /health is already stuck? Is there high load on linstor? Any other action at that time?

kvaps commented 3 years ago

No other load. I just enabled /metrics scraping by three vmagents, each scraping every 10 seconds. I'll try to reduce this number.
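
The scraping pattern described above (three scrapers, one request every 10 seconds each) could be approximated without vmagent to check whether the hang is reproducible. This is a rough sketch only: `run_scrapers` is a hypothetical helper, it does not replicate vmagent's actual request behaviour, and the round count in the usage example is arbitrary.

```shell
# Hypothetical helper: run_scrapers WORKERS ROUNDS INTERVAL COMMAND...
# Starts WORKERS parallel loops; each runs COMMAND ROUNDS times,
# sleeping INTERVAL seconds between runs, then waits for all of them.
run_scrapers() {
  workers=$1; rounds=$2; interval=$3
  shift 3
  w=0
  while [ "$w" -lt "$workers" ]; do
    (
      r=0
      while [ "$r" -lt "$rounds" ]; do
        "$@" || true          # keep looping even if one scrape fails
        r=$((r + 1))
        if [ "$r" -lt "$rounds" ]; then
          sleep "$interval"
        fi
      done
    ) &
    w=$((w + 1))
  done
  wait                        # block until every worker loop has finished
}
```

For example, `run_scrapers 3 6 10 curl --silent --max-time 5 --output /dev/null 'http://localhost:3370/metrics'` issues three parallel scrapes every 10 seconds for about a minute. The `--max-time` limit keeps a stuck request from stalling a worker indefinitely.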

kvaps commented 3 years ago

And now all the nodes have gone offline:

╭───────────────────────────────────────────────────────╮
┊ Node  ┊ NodeType  ┊ Addresses               ┊ State   ┊
╞═══════════════════════════════════════════════════════╡
┊ m1c4  ┊ SATELLITE ┊ 10.28.36.164:3367 (SSL) ┊ OFFLINE ┊
┊ m1c5  ┊ SATELLITE ┊ 10.28.36.165:3367 (SSL) ┊ OFFLINE ┊
┊ m1c6  ┊ SATELLITE ┊ 10.28.36.166:3367 (SSL) ┊ OFFLINE ┊
┊ m1c7  ┊ SATELLITE ┊ 10.28.36.167:3367 (SSL) ┊ OFFLINE ┊
┊ m1c8  ┊ SATELLITE ┊ 10.28.36.168:3367 (SSL) ┊ OFFLINE ┊
┊ m1c9  ┊ SATELLITE ┊ 10.28.36.169:3367 (SSL) ┊ OFFLINE ┊
┊ m1c10 ┊ SATELLITE ┊ 10.28.36.170:3367 (SSL) ┊ OFFLINE ┊
┊ m1c12 ┊ SATELLITE ┊ 10.28.36.172:3367 (SSL) ┊ OFFLINE ┊
┊ pve1  ┊ SATELLITE ┊ 10.28.36.159:3367 (SSL) ┊ OFFLINE ┊
┊ pve2  ┊ SATELLITE ┊ 10.28.36.160:3367 (SSL) ┊ OFFLINE ┊
┊ pve3  ┊ SATELLITE ┊ 10.28.36.161:3367 (SSL) ┊ OFFLINE ┊
╰───────────────────────────────────────────────────────╯

controller.log