yyyar / gobetween

Modern & minimalistic load balancer for the Cloud era
http://gobetween.io

gobetween - healthcheck - exec timeout error #309

Closed adeelaziz7 closed 3 years ago

adeelaziz7 commented 3 years ago

Hi,

I am using gobetween with 2 nodes and a Windows batch file (healthcheck.bat) as the health check. The batch file runs a curl command, which calls a REST API of my application on the node to fetch its status. When given an IP and port on the command line, healthcheck.bat correctly returns the health status as 0 or 1. Here is my configuration:

[servers.apigateway.healthcheck]
kind = "exec"
interval = "6s"
ping_timeout_duration = "5000ms"

I even set the curl timeout to 5 seconds, but I still get the following errors in the log (for security, I have replaced the actual IPs with xxx.xxx.xxx.x):

2021-01-05 09:40:41 [INFO ] (manager): Initializing...
2021-01-05 09:40:41 [INFO ] (server): Creating 'softphone': xxx.xxx.xxx.1:8449 iphash1 static none
2021-01-05 09:40:41 [INFO ] (scheduler): Starting scheduler softphone
2021-01-05 09:40:41 [INFO ] (server): Creating 'apigateway': xxx.xxx.xxx.1:8450 iphash1 static exec
2021-01-05 09:40:41 [INFO ] (scheduler): Starting scheduler apigateway
2021-01-05 09:40:41 [INFO ] (manager): Initialized
2021-01-05 09:40:41 [INFO ] (metrics): Metrics disabled
2021-01-05 09:40:41 [INFO ] (api): API disabled
2021-01-05 09:41:59 [INFO ] (execTimeout): Response from exec [healthcheck.bat xxx.xxx.xxx.2 8445] is timed out. Killing process...
2021-01-05 09:41:59 [WARNI] (healthcheck/exec): exit status 1
2021-01-05 09:41:59 [INFO ] (execTimeout): Response from exec [healthcheck.bat xxx.xxx.xxx.3 8445] is timed out. Killing process...
2021-01-05 09:41:59 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.2 8445} false}
2021-01-05 09:41:59 [WARNI] (healthcheck/exec): exit status 1
2021-01-05 09:41:59 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.3 8445} false}
2021-01-05 09:42:05 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.2 8445} true}
2021-01-05 09:42:05 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.3 8445} true}
2021-01-05 09:46:41 [INFO ] (execTimeout): Response from exec [healthcheck.bat xxx.xxx.xxx.2 8445] is timed out. Killing process...
2021-01-05 09:46:41 [INFO ] (execTimeout): Response from exec [healthcheck.bat xxx.xxx.xxx.3 8445] is timed out. Killing process...
2021-01-05 09:46:41 [WARNI] (healthcheck/exec): exit status 1
2021-01-05 09:46:41 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.2 8445} false}
2021-01-05 09:46:41 [WARNI] (healthcheck/exec): exit status 1
2021-01-05 09:46:41 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.3 8445} false}
2021-01-05 09:46:47 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.3 8445} true}
2021-01-05 09:46:47 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.2 8445} true}
2021-01-05 09:50:53 [INFO ] (execTimeout): Response from exec [healthcheck.bat xxx.xxx.xxx.2 8445] is timed out. Killing process...
2021-01-05 09:50:53 [INFO ] (execTimeout): Response from exec [healthcheck.bat xxx.xxx.xxx.3 8445] is timed out. Killing process...
2021-01-05 09:50:53 [WARNI] (healthcheck/exec): exit status 1
2021-01-05 09:50:53 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.3 8445} false}
2021-01-05 09:50:53 [WARNI] (healthcheck/exec): exit status 1
2021-01-05 09:50:53 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.2 8445} false}
2021-01-05 09:50:59 [INFO ] (execTimeout): Response from exec [healthcheck.bat xxx.xxx.xxx.2 8445] is timed out. Killing process...
2021-01-05 09:50:59 [INFO ] (execTimeout): Response from exec [healthcheck.bat xxx.xxx.xxx.3 8445] is timed out. Killing process...
2021-01-05 09:50:59 [WARNI] (healthcheck/exec): exit status 1
2021-01-05 09:50:59 [WARNI] (healthcheck/exec): exit status 1
2021-01-05 09:51:05 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.2 8445} true}
2021-01-05 09:51:05 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.3 8445} true}
2021-01-05 09:51:53 [INFO ] (execTimeout): Response from exec [healthcheck.bat xxx.xxx.xxx.3 8445] is timed out. Killing process...
2021-01-05 09:51:53 [WARNI] (healthcheck/exec): exit status 1
2021-01-05 09:51:53 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.3 8445} false}
2021-01-05 09:51:59 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.3 8445} true}
2021-01-05 09:52:29 [INFO ] (execTimeout): Response from exec [healthcheck.bat xxx.xxx.xxx.3 8445] is timed out. Killing process...
2021-01-05 09:52:29 [WARNI] (healthcheck/exec): exit status 1
2021-01-05 09:52:29 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.3 8445} false}
2021-01-05 09:52:35 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.3 8445} true}
2021-01-05 10:01:47 [INFO ] (execTimeout): Response from exec [healthcheck.bat xxx.xxx.xxx.2 8445] is timed out. Killing process...
2021-01-05 10:01:47 [WARNI] (healthcheck/exec): exit status 1
2021-01-05 10:01:47 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.2 8445} false}
2021-01-05 10:01:53 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.2 8445} true}
2021-01-05 10:06:46 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.3 8445} false}
2021-01-05 10:07:45 [INFO ] (healthcheck/worker): Sending to scheduler: {{xxx.xxx.xxx.3 8445} true}

Can you provide me a solution to this?

Many thanks,

Adeel.

yyyar commented 3 years ago

Hi @adeelaziz7. Your config seems incomplete. Could you provide the full healthcheck section? ping_timeout_duration is not an option for the exec healthcheck. You probably need to set the timeout property instead.

Here are the valid config options for exec, from the docs:

[servers.default.healthcheck]             # (optional)
interval = "2s"                           # (required) healthcheck running interval
timeout = "0s"                            # (required) max time for healthcheck to execute until mark as unhealthy
fails = 5                                 # (optional) consecutive number of checks that should fail, to mark backend as unhealthy
passes = 2                                # (optional) consecutive number of checks that should pass, to mark backend as healthy
initial_status = "healthy"                # (optional) "healthy" | "unhealthy"
kind = "exec"
exec_command = "/path/to/healthcheck.sh"  # (required) command to execute
exec_expected_positive_output = "1"       # (required) expected output of command in case of success
exec_expected_negative_output = "0"       # (required) expected output of command in case of failure
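
Applied to the apigateway server from the question, the healthcheck section might look like the sketch below. Only kind and interval come from the original snippet, and the healthcheck.bat name comes from the log. The timeout value, the relative script path, and the choice of "1" as the healthy output are assumptions to adjust to the actual setup.

```toml
[servers.apigateway.healthcheck]
kind = "exec"
interval = "6s"                        # from the original snippet
timeout = "5s"                         # assumption: replaces the unsupported ping_timeout_duration = "5000ms"
exec_command = "healthcheck.bat"       # assumption: use the full path if the script is not in the working directory
exec_expected_positive_output = "1"    # assumption: the script prints 1 when the node is healthy
exec_expected_negative_output = "0"    # assumption: the script prints 0 when the node is unhealthy
```

As the log lines with [healthcheck.bat xxx.xxx.xxx.2 8445] show, the command is invoked with the backend host and port as arguments. It may also help to keep the curl timeout inside the script shorter than timeout, so the script has a chance to print 0 itself instead of being killed.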

adeelaziz7 commented 3 years ago

Hi @yyyar, thanks for the quick response, it worked! I had mixed up the ping configuration with the exec one.