apache / cloudstack

Apache CloudStack is an open source Infrastructure as a Service (IaaS) cloud computing platform
https://cloudstack.apache.org/
Apache License 2.0

Suggestion: When a Global Setting such as network.loadbalancer.haproxy.max.conn is changed, mark the VR as 'Requires Upgrade' instead of marking it as a failed healthcheck. #9800

Open btzq opened 3 weeks ago

btzq commented 3 weeks ago
ISSUE TYPE
COMPONENT NAME
Virtual Router, HA Proxy
CLOUDSTACK VERSION
4.19.1
CONFIGURATION
OS / ENVIRONMENT
SUMMARY

One of our customers required a larger HA Proxy max connections value, as they have many users connecting at the same time.

So we changed the default value of the following parameter in Global Settings to a new one: network.loadbalancer.haproxy.max.conn

Once this was done and we restarted the CloudStack server, we got a whole bunch of healthcheck failures.

Screenshots below:

[Screenshot 2024-10-15 at 10 29 55 PM]

[Screenshot 2024-10-15 at 10 31 06 PM]

In this case, I don't think this should be counted as a healthcheck issue, because the service seems to be working fine.

I think a better experience for the operator would be to mark the router as 'Requires Upgrade'.

This is because the VR does not need to be re-created; it just needs a forced reboot. (FYI, a normal reboot doesn't seem to cause the VR to load the new maxconn value.)
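As a rough illustration of the proposed behavior, here is a minimal sketch in Python. It is not CloudStack code: `RouterCheck`, `classify_router`, and the status strings are hypothetical names, and the maxconn numbers in the usage note are made up for illustration. The idea is that a VR whose HA Proxy service is running, but whose applied maxconn differs from the global setting, would be reported as 'Requires Upgrade' rather than as a failed healthcheck.

```python
from dataclasses import dataclass

# Hypothetical status labels, chosen to mirror the wording in this issue.
HEALTHY = "Healthy"
REQUIRES_UPGRADE = "Requires Upgrade"
FAILED = "Health Check Failed"


@dataclass
class RouterCheck:
    """Hypothetical snapshot of what a health check might observe on one VR."""
    service_running: bool   # HA Proxy is up and serving traffic
    applied_maxconn: int    # maxconn value currently loaded on the VR
    expected_maxconn: int   # value of network.loadbalancer.haproxy.max.conn


def classify_router(check: RouterCheck) -> str:
    """Separate a real outage from a setting that has not been applied yet."""
    if not check.service_running:
        # The service is actually down: this is a genuine health check failure.
        return FAILED
    if check.applied_maxconn != check.expected_maxconn:
        # The service works, but the VR still runs with the old setting.
        return REQUIRES_UPGRADE
    return HEALTHY
```

For example, `classify_router(RouterCheck(True, 4096, 8192))` would yield 'Requires Upgrade', while a VR with the service stopped would still be reported as a failure and raise an alert.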

As operators, we rely on the 'Alert' section to ensure all customer VRs are working normally. The current behavior creates a lot of noise.

Even better would be for each customer to be able to set their own network.loadbalancer.haproxy.max.conn value and additional settings, because not all customers require such large values.

STEPS TO REPRODUCE
Refer to the summary above.
EXPECTED RESULTS
Mark the router as 'Requires Upgrade' when a Global Setting such as network.loadbalancer.haproxy.max.conn is changed.
ACTUAL RESULTS
Bombarded with health check failures for all VRs created, each of which requires a manual forced reboot or a VR cleanup (a normal reboot doesn't work).
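For the manual workaround, a small sketch of how the forced reboot request could be assembled per router. It assumes the `rebootRouter` API command accepts `id` and `forced` parameters; please verify this against the API reference for your CloudStack version. The helper name `forced_reboot_query` and the router ID in the usage note are hypothetical, and request signing and sending are deliberately left out.

```python
from urllib.parse import urlencode


def forced_reboot_query(router_id: str) -> str:
    """Build the unsigned query string for a forced reboot of one VR.

    Assumption: the rebootRouter command takes 'id' and 'forced'.
    Signing (apikey/signature) and the HTTP call are not shown here.
    """
    params = {
        "command": "rebootRouter",
        "id": router_id,
        "forced": "true",   # a normal reboot did not load the new maxconn value
        "response": "json",
    }
    # Sort the parameters so the resulting string is deterministic.
    return urlencode(sorted(params.items()))
```

Calling `forced_reboot_query("vr-1")` produces `command=rebootRouter&forced=true&id=vr-1&response=json`, which would still need to be signed and sent to the management server for each affected router.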
weizhouapache commented 2 weeks ago

seems to be a valid bug