apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

Server HealthCheck StatusCallback should separate instance and table/segment health check #9023

Closed walterddr closed 1 year ago

walterddr commented 2 years ago

Currently, BaseServerStarter registers a status health check callback for every enabled table on the server, and the health check only returns OK once all of those tables/segments are loaded.

This causes problems for servers with a large number of tables or a large amount of data to load when running in a container environment.
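
For context, a minimal sketch of the aggregation semantics described above (illustrative only; the class and method names here are not the actual BaseServerStarter code):

```java
import java.util.List;

// Illustrative sketch: the server health check aggregates one status
// callback per hosted table and reports GOOD only once every
// table/segment callback is GOOD.
enum Status { STARTING, GOOD, BAD }

interface StatusCallback {
  Status getStatus();
}

class AggregateStatusCallback implements StatusCallback {
  private final List<StatusCallback> _tableCallbacks;

  AggregateStatusCallback(List<StatusCallback> tableCallbacks) {
    _tableCallbacks = tableCallbacks;
  }

  @Override
  public Status getStatus() {
    for (StatusCallback callback : _tableCallbacks) {
      Status status = callback.getStatus();
      if (status != Status.GOOD) {
        // A single slow table keeps the whole server "unhealthy",
        // which is what makes the container orchestrator restart it.
        return status;
      }
    }
    return Status.GOOD;
  }
}
```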

Proposal

Reference

[OLD PROPOSAL]

[OLDER PROPOSAL] Strip the table/segment status callback from the server health check endpoint so it only checks the server's own health, and create a /health/services/ endpoint that checks server + table resource health and only returns OK if everything is healthy.

mcvsubbu commented 2 years ago

We need this check in our environment. Please do not remove it.

Instead, you can use this config variable (set it to 0) to disable it altogether.

https://github.com/apache/pinot/blob/master/pinot-spi/src/main/java/org/apache/pinot/spi/utils/CommonConstants.java#L370

walterddr commented 2 years ago

OK, yes, that would change the semantics of an existing REST API.

Counter proposal

walterddr commented 2 years ago

@mcvsubbu @mayankshriv @npawar ^

mcvsubbu commented 2 years ago

The existing behavior of loading all segments was put in place to allow for rolling restarts and upgrades. With this behavior, we can use the health check after an upgrade or restart and rest assured that we don't compromise availability. @walterddr, how do you propose to solve this problem in the container environment?

walterddr commented 2 years ago

I think the issue I am trying to solve here is orthogonal to the rolling restart problem, which involves intentional restarts. I am happy to create another issue to address that for the containerized environment.

This issue mainly discusses unintentional restarts triggered by the containerized environment. Here I propose creating additional endpoints to retrieve "instance availability" and "data availability" separately. This shouldn't affect how the rolling restart issue is addressed, containerized or not.

jadami10 commented 1 year ago

We have the same problem where servers take too long to become healthy and the ASG recycles them forever. This happens for intentional restarts as well as unintentional ones. Rewording what you said to see if we agree; there does seem to be an issue here where the /health endpoint is doing too much:

I'm +1 for your counter proposal. We can use /health/instance for whatever instance/container manager is being used and leave /health as is for backwards compatibility.

mcvsubbu commented 1 year ago

+1 for new endpoint, yes

Jackie-Jiang commented 1 year ago

I'd suggest adding a new API with more information in the response:

It is much more flexible this way, and different clients can choose how to parse the response. E.g. a liveness check can pass when the server is not BAD, while a readiness check can pass only when the server is GOOD.
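
As a sketch, such an API might return the status enum in the body and let each caller decide what counts as healthy (hypothetical JAX-RS resource; the path and response shape here are assumptions, not the actual proposal):

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// Hypothetical endpoint that exposes the full status enum in the
// response body instead of collapsing it into 200/503.
@Path("/health/status")
public class ServiceStatusResource {

  enum Status { STARTING, GOOD, BAD }

  @GET
  @Produces(MediaType.APPLICATION_JSON)
  public String getStatus() {
    // A liveness client treats anything but BAD as passing;
    // a readiness client requires GOOD.
    return String.format("{\"status\":\"%s\"}", currentStatus());
  }

  private Status currentStatus() {
    return Status.STARTING; // placeholder for the real status lookup
  }
}
```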

jadami10 commented 1 year ago

One concern I have with a more complex API is that most load balancers only allow configuring the path and then look at the status code of the response.

So having both /health/instance and /health/services like in the proposal makes the most sense to me.

Jackie-Jiang commented 1 year ago

> One concern I have with a more complex API is that most load balancers only allow configuring the path and then look at the status code of the response.
>
> So having both /health/instance and /health/services like in the proposal makes the most sense to me.

@jadami10 What if we return a different status code for each status? The current API only supports 200 and 503, which cannot represent all the available statuses. One problem is that different clients might have different logic for parsing the Pinot service status, and fixing the return type reduces that flexibility. We don't want to add an extra new API for each different status check logic.
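
As a sketch, that idea might map each status onto a distinct code (status names and code choices here are illustrative, not a settled mapping):

```java
// Illustrative mapping of service status onto distinct HTTP codes,
// so a single endpoint could serve clients of different strictness.
class HealthHttpCodes {
  enum Status { STARTING, GOOD, BAD }

  static int toHttpCode(Status status) {
    switch (status) {
      case GOOD:
        return 200; // fully ready to serve queries
      case STARTING:
        return 503; // alive, but still loading segments / catching up
      case BAD:
      default:
        return 500; // the instance itself is unhealthy
    }
  }
}
```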

jadami10 commented 1 year ago

I don't think your typical ELB parses responses that way. Any non-200 code is considered failing. And in this case we want a server that's up but still downloading segments/catching up to return 200 on one endpoint and non-200 on another.

@walterddr, while you're in here, do you plan to add healthcheck metrics for the server

```java
HEALTHCHECK_OK_CALLS("healthcheck", true),
HEALTHCHECK_BAD_CALLS("healthcheck", true),
```

similar to what the broker and controller have?
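
A sketch of what the server-side entries might look like, assuming the server's meter enum follows the same (meterName, global) constructor pattern quoted above (the enum shown here is hypothetical, not existing Pinot code):

```java
// Hypothetical server-side counterparts to the broker/controller
// healthcheck meters, following the same (meterName, global) pattern.
public enum ServerHealthcheckMeter {
  HEALTHCHECK_OK_CALLS("healthcheck", true),
  HEALTHCHECK_BAD_CALLS("healthcheck", true);

  private final String _meterName;
  private final boolean _global;

  ServerHealthcheckMeter(String meterName, boolean global) {
    _meterName = meterName;
    _global = global;
  }
}
```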

walterddr commented 1 year ago

Enum type in response vs. different endpoints

IMO it is best to use different endpoints instead of the same endpoint with varying types of response

Some Thoughts

It seems like this is causing some confusion, and addressing both issues at the same time also distracts from the discussion. So I would say let's focus on the main purpose of this discussion: finding a way for containerization software to know whether a Pinot server is alive vs. ready to serve queries.

If there's a way to easily configure k8s or an ELB to differentiate the two server states, I am OK with using just one endpoint.
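
For reference, with two endpoints the Kubernetes side stays simple, because httpGet probes only inspect the response status code (a sketch using the endpoint names from the proposal above; the port and thresholds are illustrative):

```yaml
# Sketch: separate probes against separate endpoints.
livenessProbe:
  httpGet:
    path: /health/instance   # process is up; don't kill the pod
    port: 8097               # illustrative admin port
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health/services   # tables/segments loaded; route traffic
    port: 8097
  periodSeconds: 10
  failureThreshold: 60       # tolerate long segment-loading windows
```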

Jackie-Jiang commented 1 year ago

I checked other systems, and it seems the health check is mostly used for liveness rather than readiness. Since we already use it as the readiness check, we can leave it as is for backward compatibility.

Based on the requirement, I'd suggest making the API more explicit about its purpose:

I feel using "instance" and "service" is not as clear.
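
Given the liveness-vs-readiness framing above, the explicit version might look like the following (a sketch only, not the merged implementation; the paths and status plumbing are assumptions):

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.core.Response;

// Sketch of purpose-explicit endpoints: liveness answers "is the
// process up?", readiness answers "can it serve all its segments?".
@Path("/health")
public class HealthCheckResource {

  @GET
  @Path("/liveness")
  public Response checkLiveness() {
    // Passes as soon as the server process can answer HTTP at all.
    return Response.ok("OK").build();
  }

  @GET
  @Path("/readiness")
  public Response checkReadiness() {
    // Passes only once all table/segment callbacks report GOOD.
    return allSegmentsLoaded()
        ? Response.ok("OK").build()
        : Response.status(Response.Status.SERVICE_UNAVAILABLE).build();
  }

  private boolean allSegmentsLoaded() {
    return false; // placeholder for the aggregate service-status check
  }
}
```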

walterddr commented 1 year ago

Sounds good. I will create these. ^ @jadami10 any follow-up?

jadami10 commented 1 year ago

nope, those look great