knative / serving

Kubernetes-based, scale-to-zero, request-driven compute
https://knative.dev/docs/serving/
Apache License 2.0
5.56k stars 1.16k forks source link

Make queue-proxy NOT count the health check requests by the users #14581

Open houshengbo opened 1 year ago

houshengbo commented 1 year ago

Describe the feature

We leverage a platform managing all the knative services. All knative services report the health status via a separate URL/path, and the path is NOT consistent across different knative services.(Just confirmed internally with the team)

For example, the health check path of the knative service is determined by the model deployed in the knative service. Different models have different url for health check.

Each time, when the health check is called, the request is counted by the queue-proxy, which makes the metrics data NOT accurate, since all the health check requests are counted as well.

We are thinking of whether knative serving can distinguish the normal requests and the health check requests, so the health check requests do not count by the queue-proxy.

@skonto @dprotaso

skonto commented 1 year ago

Hi @houshengbo

Each time, when the health check is called, the request is counted by the queue-proxy, which makes the metrics data NOT accurate, since all the health check requests are counted as well.

Could you specify which metrics are not accurate, are these Knative metrics or external ones?
I see that we filter probes here and locally I didn't see these type of requests being counted.

Btw do you use KServe and https://github.com/kserve/kserve/blob/master/qpext/cmd/qpext/main.go?

houshengbo commented 9 months ago

@skonto We are using KServe to launch the inference services, which launch the knative services as the backend supporting services. The health check is from external of the inference services. The inference services usually have endpoint to call to run the health check. When there is a request coming to the inference service, no matter it is directly calling the inference service to serve or just checking the health. It will always be counted as the valid requests for that specific service, and the number is added into the metrics. In fact, it is more meaningful for us to count only the requests calling the inference service, not including the requests calling the health check endpoint.

I saw the filter here, is this a certain key for the header that we can use, so it will not be counted?

skonto commented 9 months ago

The health check is from external of the inference services.

Is that kubelet that checks the health of the isvc? The filter mentioned above checks for the K-Kubelet-Probe header (among others) and does not count the request for metrics reporting. Could you provide more details about how you use KServe and a reproducer for the issue? Could you show an example with KServe which I can test? For example, do you use https://github.com/kserve/kserve/tree/master/qpext? At the opendatahub side we have been discussing probes lately for KServe but haven't noticed something related.

yuzisun commented 9 months ago

@skonto We use consul for service discovery and consul sends health checks to the KServe inference services which are currently counted as normal request by Knative. Are you suggesting to add the K-Kubelet-Probe header to consul health checks ?

skonto commented 9 months ago

Hi @yuzisun I see a couple of options as I am not familiar with the architecture you have:

github-actions[bot] commented 6 months ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

skonto commented 5 months ago

/remove-lifecycle stale

github-actions[bot] commented 2 months ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

skonto commented 2 months ago

/remove-lifecycle stale