Open qrkourier opened 1 year ago
There is a router health check configuration (link to reference) that can be added to the router config. We currently use it to verify the health of a router, e.g., whether it's connected to the controller (ctrlPingCheck).
It doesn't specifically show that the router is "ready to receive traffic", because that could mean many things: the router could be connected on certain links while others are degraded or down. This is just a general check to ensure the router receives updates from the controller.
The most recent addition to the health check is linkCheck, so you can verify whether the router has a link, or even a specific link to another router, indicating it's ready to transport traffic.
To add the health check to the ER, you need to add something like this to the ER config:
healthChecks:
  ctrlPingCheck:
    interval: 30s
    timeout: 15s
    initialDelay: 15s
  linkCheck:
    minLinks: 1
    interval: 5s
    initialDelay: 5s
Then add a web section to the ER config like this:
web:
  - name: health-check
    bindPoints:
      - interface: 0.0.0.0:8081
        address: 0.0.0.0:8081
    apis:
      - binding: health-checks
This combination would allow you to GET https://localhost:8081/health-checks
The above would produce something like this as an output:
{
"data": {
"checks": [
{
"details": null,
"healthy": true,
"id": "controllerPing",
"lastCheckDuration": "4.344µs",
"lastCheckTime": "2024-02-13T18:40:01Z"
},
{
"details": [
{
"linkId": "lndLOpwd7yOcSXtCcPwWf",
"destRouterId": "j.LOxzd9A",
"latency": 3271785.96875,
"addresses": {
"ack": {
"localAddr": "tcp:10.19.116.60:443",
"remoteAddr": "tcp:34.199.168.165:61031"
},
"payload": {
"localAddr": "tcp:10.19.116.60:443",
"remoteAddr": "tcp:34.199.168.165:65235"
}
}
},
{
"linkId": "3W72EY2a0inbyCHIdYk6Gd",
"destRouterId": "f9fs.nvej",
"latency": 84934025.1015625,
"addresses": {
"ack": {
"localAddr": "tcp:10.19.116.60:443",
"remoteAddr": "tcp:35.181.192.76:42764"
},
"payload": {
"localAddr": "tcp:10.19.116.60:443",
"remoteAddr": "tcp:35.181.192.76:42750"
}
}
},
{
"linkId": "1yp3sDwqj6CHui4Zmt89wB",
"destRouterId": "PKud5nLtj",
"latency": 188746151.2265625,
"addresses": {
"ack": {
"localAddr": "tcp:10.19.116.60:58616",
"remoteAddr": "tcp:52.66.46.9:443"
},
"payload": {
"localAddr": "tcp:10.19.116.60:52392",
"remoteAddr": "tcp:52.66.46.9:443"
}
}
},
{
"linkId": "5wjStaisGHkT5Xu0fQrfEq",
"destRouterId": "bnq85xLt3",
"latency": 201215039.2109375,
"addresses": {
"ack": {
"localAddr": "tcp:10.19.116.60:443",
"remoteAddr": "tcp:18.61.94.28:48878"
},
"payload": {
"localAddr": "tcp:10.19.116.60:443",
"remoteAddr": "tcp:18.61.94.28:48864"
}
}
},
{
"linkId": "4e0RACw8dsguV43BrAsQfc",
"destRouterId": "u6Q1QSPulm",
"latency": 3872621.265625,
"addresses": {
"ack": {
"localAddr": "tcp:10.19.116.60:443",
"remoteAddr": "tcp:132.145.157.243:48868"
},
"payload": {
"localAddr": "tcp:10.19.116.60:443",
"remoteAddr": "tcp:132.145.157.243:48864"
}
}
},
{
"linkId": "6AmAWcfB50a2OzFvlsr1vn",
"destRouterId": "7fTQPzdt7d",
"latency": 3430749.8671875,
"addresses": {
"ack": {
"localAddr": "tcp:10.19.116.60:443",
"remoteAddr": "tcp:3.217.193.94:50371"
},
"payload": {
"localAddr": "tcp:10.19.116.60:443",
"remoteAddr": "tcp:3.217.193.94:25962"
}
}
},
{
"linkId": "1CnhJJ73e1AiKjoVoRy3tt",
"destRouterId": "oWeCqGOcJ",
"latency": 2906363.9140625,
"addresses": {
"ack": {
"localAddr": "tcp:10.19.116.60:443",
"remoteAddr": "tcp:52.54.127.95:56812"
},
"payload": {
"localAddr": "tcp:10.19.116.60:443",
"remoteAddr": "tcp:52.54.127.95:56796"
}
}
},
{
"linkId": "3vfjSpoNfoYbyqIBbT7ZKx",
"destRouterId": "R7nKHgLtj",
"latency": 66181343.484375,
"addresses": {
"ack": {
"localAddr": "tcp:10.19.116.60:45646",
"remoteAddr": "tcp:44.225.183.166:443"
},
"payload": {
"localAddr": "tcp:10.19.116.60:45640",
"remoteAddr": "tcp:44.225.183.166:443"
}
}
},
{
"linkId": "2MaMaVsqFNtvhRejCpyR7y",
"destRouterId": "s3FjWqdlS",
"latency": 70628346.7578125,
"addresses": {
"ack": {
"localAddr": "tcp:10.19.116.60:443",
"remoteAddr": "tcp:54.77.98.202:57722"
},
"payload": {
"localAddr": "tcp:10.19.116.60:443",
"remoteAddr": "tcp:54.77.98.202:57720"
}
}
},
{
"linkId": "2EVj6GaGBr1KEFXSeypc3i",
"destRouterId": "joI2Wqdlb",
"latency": 187742628.4765625,
"addresses": {
"ack": {
"localAddr": "tcp:10.19.116.60:38150",
"remoteAddr": "tcp:15.207.241.220:443"
},
"payload": {
"localAddr": "tcp:10.19.116.60:38136",
"remoteAddr": "tcp:15.207.241.220:443"
}
}
}
],
"healthy": true,
"id": "link.health",
"lastCheckDuration": "120.997µs",
"lastCheckTime": "2024-02-13T18:40:06Z"
}
],
"healthy": true
},
"meta": {}
}
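For anyone who wants to turn that response into a yes/no answer (for example from a script run by a probe), here is a minimal Python sketch. It relies only on the JSON shape shown above (`data.healthy`, `data.checks[].id`, `data.checks[].healthy`, and `destRouterId` inside the `link.health` details). The function names, the `ca_file` parameter, and the idea of passing a list of required router IDs are my own assumptions for illustration, not part of the router's API. I'm also assuming the listener might answer with a non-2xx status when unhealthy, so the fetch helper tries to read the body in that case too.

```python
import json
import ssl
import urllib.error
import urllib.request


def fetch_health_checks(url, ca_file=None, timeout=5.0):
    """GET the health-checks URL and return the decoded JSON body.

    ca_file is a hypothetical path to a CA bundle that can verify the
    router's web listener certificate; None uses the system defaults.
    """
    context = None
    if url.startswith("https://"):
        context = ssl.create_default_context(cafile=ca_file)
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        # Assumption: an unhealthy router may still send the JSON body
        # alongside a non-2xx status, so try to decode it anyway.
        return json.load(err)


def summarize_health(payload, required_router_ids=()):
    """Reduce a health-checks payload to a small readiness summary.

    required_router_ids is an optional list of destRouterId values that
    must appear in the link.health details for the router to count as ready.
    """
    data = payload.get("data", {})
    checks = {check.get("id"): check for check in data.get("checks", [])}

    # Any individual check reporting healthy=false (or missing the field).
    failing = sorted(cid for cid, check in checks.items() if not check.get("healthy"))

    # "details" is null for controllerPing, so fall back to an empty list.
    link_details = checks.get("link.health", {}).get("details") or []
    linked_routers = {detail.get("destRouterId") for detail in link_details}
    missing = sorted(set(required_router_ids) - linked_routers)

    ready = bool(data.get("healthy")) and not failing and not missing
    return {"ready": ready, "failing_checks": failing, "missing_links": missing}
```

Something like `summarize_health(fetch_health_checks("https://localhost:8081/health-checks"), ["j.LOxzd9A"])` would then report whether the overall status is healthy and whether a link to that specific router is present. Whether "has a link to router X" is the right definition of ready is still the open question here.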
Link to controller health-checks reference: https://openziti.io/docs/reference/configuration/controller#healthchecks
K8s will use each type of probe if we make them available. A successful probe means:
Ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/