telefonicaid / fiware-orion

Context Broker and CEF building block for context data management, providing NGSI interfaces.
https://fiware-orion.rtfd.io/
GNU Affero General Public License v3.0

Support Kubernetes deployment scenarios #3758

Open wistefan opened 3 years ago

wistefan commented 3 years ago

To run Orion in a stable and performant way in Kubernetes environments, it would be good to have solutions for the following 2 problems:

1. Health check

In order to work properly on Kubernetes, an application needs to provide a health endpoint. You can find a description of readiness/liveness probes here: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/ Currently, the Orion tutorials suggest using the /version endpoint. Using that as a health endpoint leads to unwanted restarts in high-load scenarios, since Orion starts to close connections to mitigate the load. Because of that, it also rejects the health requests and Kubernetes assumes Orion is broken. To fix that, Kubernetes restarts the container. That behaviour further worsens the problem, since Orion never gets to work through the outstanding requests, while what would actually help is horizontally scaling the instances to increase the available resources. Because of this, an endpoint that serves the purpose of a health check independently from the normal API is required.
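For illustration, the probe setup implied by the tutorials looks roughly like this (a sketch; the port is Orion's default, the timing values are placeholders and not taken from the tutorials):

```yaml
# Sketch of a liveness probe against the normal API, using /version.
# Under a burst of load Orion may reject this connection too, the probe
# fails, and Kubernetes restarts a container that would have recovered
# on its own.
livenessProbe:
  httpGet:
    path: /version
    port: 1026          # Orion's default listening port
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3   # three rejected probes in a row trigger a restart
```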

2. Respect memory restrictions

Kubernetes (and other container solutions like pure Docker) provides functionality to restrict the resource (CPU, memory) usage of a container. If a container reaches its CPU restrictions, it simply slows down. But if it tries to allocate more memory than it got assigned, Kubernetes will kill the container in order to prevent negative influence on the other workloads in the environment. This again has the effect that the system cannot scale Orion automatically (or even by manual intervention) in high-load scenarios. It would be good if the container simply slowed down and/or rejected connections until enough memory is available again.
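For reference, this is roughly how such a restriction is expressed in the pod spec (values are placeholders, not a sizing recommendation):

```yaml
# Sketch of container resource settings. When memory usage exceeds
# limits.memory the container is OOM-killed; exceeding the CPU limit
# only throttles it.
resources:
  requests:
    cpu: "1"
    memory: 4Gi
  limits:
    cpu: "2"
    memory: 4Gi
```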

For trying out Orion in a Kubernetes environment, this helm chart can be used: https://github.com/FIWARE/helm-charts/tree/main/charts/orion

kzangeli commented 3 years ago

FYI, I implemented a simple TCP socket listener in Orion-LD - for Kubernetes to connect and disconnect - using this as a health check instead of trying to connect to the "REST" socket that is offered for normal requests.

The performance went up by more than 300% after this simple addition - from some 250-300 requests per second (per core) to almost 1000.

So, if the broker is to run under kubernetes, something similar to what I did in Orion-LD is definitely necessary.
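On the Kubernetes side, probing such a dedicated socket would look roughly like this (a sketch; the port number is hypothetical and not Orion-LD's actual value):

```yaml
# Sketch of a TCP probe against a dedicated health socket, served outside
# the normal request handling, so load on port 1026 does not fail it.
livenessProbe:
  tcpSocket:
    port: 8001          # hypothetical dedicated health-check port
  periodSeconds: 10
  failureThreshold: 3
```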

fgalan commented 3 years ago

> The performance went up by more than 300% after this simple addition - from some 250-300 requests per second (per core) to almost 1000.

Do you have figures for when the health check is disabled altogether? To have some kind of "upper limit".

kzangeli commented 3 years ago

> Do you have figures for when the health check is disabled altogether? To have some kind of "upper limit".

My guess would be that those 1000 requests per second are really very close to the upper limit. Very little "broker time" is spent listening on and accepting a connection, and I can't imagine that Kubernetes tries to connect more than perhaps once a second - but I'm no expert on Kubernetes, far from it :) Stefan probably knows more (or anybody with knowledge of Kubernetes).

mapedraza commented 3 years ago

Regarding the first item (health check), the problem of using a separate endpoint to do the check (i.e. an extra port opened by the contextBroker process, let's say port number 6201) is that such an endpoint doesn't necessarily guarantee that CB is providing service. I mean, opening-closing a connection to 6201 may be fine, but it doesn't necessarily mean that the service on port 1026 is working as expected (the MHD server at 1026 could be completely broken due to a bug in Orion while 6201 still survives, giving the wrong impression that CB is working well). Thus a health check with GET /version on 1026 comes closer to providing a precise health check (it could be even better to check GET /v2/entities, as that operation "traverses" all the Orion CB logic, including the DB backend, while GET /version only scratches the surface).
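A sketch of such a deeper check, expressed as a Kubernetes probe (the limit query parameter is only there to keep the response small; an assumption, not a documented recommendation):

```yaml
# Sketch of a readiness probe that exercises the whole request path,
# including the DB backend, instead of only the HTTP layer.
readinessProbe:
  httpGet:
    path: /v2/entities?limit=1
    port: 1026
  periodSeconds: 15
  timeoutSeconds: 5
```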

In fact, a good health check which is at the same time fast (indeed, faster than port checking) and ensures CB is providing service would be to check for summary traces in the logs (functionality described at https://fiware-orion.readthedocs.io/en/master/admin/logs/index.html#summary-traces). These traces can be used as a keepalive, so the health check can just check whether they appear in the logs with the given frequency (the one set by -logSummary). Could you configure the health check to work this way?
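One way to wire that into Kubernetes could be an exec probe; a rough sketch, assuming the default log file location (/tmp/contextBroker.log) and using the file's modification time as a cheap stand-in for actually parsing the summary trace timestamps:

```yaml
# Sketch: consider the broker healthy if its log file (where the summary
# traces land every -logSummary seconds) was modified within the last
# 2 minutes. A stricter check would grep and parse the summary lines.
livenessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      - 'test -n "$(find /tmp/contextBroker.log -mmin -2)"'
  periodSeconds: 30
```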

Regarding the second item (memory restrictions), we understand the situation is not different from Orion running on a standard operating system. If you run Orion in a place with limited memory (how much memory are you using? From https://github.com/FIWARE/helm-charts/tree/main/charts/orion, I see 128Mi, but I'm not sure if it is the one you are finally using) it is normal that it breaks. Note that the documentation at https://fiware-orion.readthedocs.io/en/master/admin/diagnosis/index.html#resource-availability recommends a minimum of 4GB. So, probably with proper sizing and a proper scaling policy (i.e. put the scaling threshold far enough from the memory limit so the scaling system has enough time to spawn the new CB instance before the former one exhausts memory) this situation can be solved (or, at least, alleviated).
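A sketch of the kind of scaling policy meant here, with the scaling threshold kept well below the memory limit (names and numbers are placeholders):

```yaml
# Sketch of a HorizontalPodAutoscaler that adds Orion replicas when average
# memory utilization passes 60% of the request, leaving headroom before the
# limit at which the container would be killed.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: orion
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: orion
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 60
```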

wistefan commented 3 years ago

Regarding the health check: I fully agree that the socket solution is far from ideal and should be extended to actually test something. It would be an option to use one of the normal endpoints if you could guarantee that requests are answered in high-load situations instead of being terminated. If Orion receives a burst of requests, the connection pool gets exhausted and it starts to reject connections. One of those connections is the health check, which leads Kubernetes to think Orion is in a state from which it cannot recover by itself. But overall, Orion is able to work in that situation and can recover by working through its open connections and then continuing to accept new ones. If the load is only a short burst, it will decrease again and nothing needs to be done. If the load stays high, Kubernetes can automatically mitigate the situation by scaling up (e.g. adding more instances of Orion). But due to the failing health check, it assumes that Orion should be restarted to mitigate the problem, thus making the situation even worse. The information inside the summary traces would be what I'm looking for in a health check, but Kubernetes only supports 3 types of health checks: HTTP requests, TCP connections and commands (see https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/). Wouldn't it be an option to map the information from the summary traces to a dedicated HTTP endpoint (running on a different thread pool than the NGSI API) that can be used for health checks?
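Such an endpoint would then be probed like any other HTTP service; a sketch, where both the path and the port are hypothetical (no such endpoint exists in Orion today):

```yaml
# Sketch of a probe against a hypothetical dedicated health endpoint that
# exposes the summary-trace information on its own port and thread pool,
# so it stays responsive while port 1026 is saturated.
livenessProbe:
  httpGet:
    path: /health        # hypothetical endpoint
    port: 8080           # hypothetical separate port
  periodSeconds: 10
  failureThreshold: 3
```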

Regarding memory: You're right that it behaves the same way as without Kubernetes. But since Kubernetes is able to automatically mitigate resource pressure by horizontal scaling, it would be better if Orion did not simply die, but instead slowed down until the system has been able to react. I know from @kzangeli that this is a quite complex problem and might not be worth the effort. The line you are referring to in the helm chart is commented out; it's best practice to not configure resource requests in a helm chart. We used 6, 12 and 48 GB in our load tests.

mapedraza commented 3 years ago

Regarding the health check: Since the information is available in the Orion logs, it is possible to parse and process it using bash commands to perform the health check (we understand that with the third supported type of Kubernetes health check, command, a check based on a bash script can be implemented that way, right?). It is available right now. Is this solution right for you?

Regarding memory: Thanks for the clarification related to the helm-chart recipes. Just to clarify, Orion needs enough resources (as you mentioned, that is not your problem). Setting an autoscaling threshold that allows spawning a new CB instance before the previous one exhausts all the memory would let the first instance stay alive.