knative / serving

Kubernetes-based, scale-to-zero, request-driven compute
https://knative.dev/docs/serving/
Apache License 2.0
5.5k stars 1.14k forks source link

Don't scrape pods via HTTP if the activator in path. #7324

Open markusthoemmes opened 4 years ago

markusthoemmes commented 4 years ago

When the activator is in the routing path, scraping the pods is kinda useless as their information is effectively zeroed out, because all requests are considered "proxied", thus the activator metrics take precedence. We can use that to make our system more efficient in these cases!

I propose two steps to go about this:

  1. Never scrape if TBC=-1. The activator will always be in path in this case, so we never need to scrape really.
  2. Start/Stop scraping if activator is in path or not. This is a little more tricky as we need to make sure that we start scraping timely once we notice the activator will be taken off the path.

As the PodAutoscaler owns all of these decisions, this should be doable for us. As it's a good-first-issue, I'd love to see it land in two separate pieces as laid out though.

Note: I'm happy to mentor first-time contributors here. The changes needed should be fairly small in terms of number of files to touch.

knative-prow-robot commented 4 years ago

@markusthoemmes: This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-good-first-issue command.

In response to [this](https://github.com/knative/serving/issues/7324): > > >When the activator is in the routing path, scraping the pods is kinda useless as their information is effectively zeroed out, because all requests are considered "proxied", thus the activator metrics take precedence. We can use that to make our system more efficient in these cases! > >I propose two steps to go about this: >1. Never scrape if TBC=-1. The activator will always be in path in this case, so we never need to scrape really. >2. Start/Stop scraping if activator is in path or not. This is a little more tricky as we need to make sure that we start scraping timely once we notice the activator will be taken off the path. > >As the PodAutoscaler owns all of these decisions, this should be doable for us. As it's a good-first-issue, I'd love to see it land in two separate pieces as laid out though. > >**Note:** I'm happy to mentor first-time contributors here. The changes needed should be fairly small in terms of number of files to touch. > Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
dsimansk commented 4 years ago

I'd like to give it a try to look into Serving internals a bit more. /assign dsimansk

vagababov commented 4 years ago

/cc @taragu BTW, this is also one of the ways you can improve GSD, if we extend the activator reporting capabilities (not necessarily a good idea, but something to consider).

markusthoemmes commented 4 years ago

The rest of this is not a good first issue anymore and requires a fair bit of caution. I'm happy to help people through it, but it's much more involved. Also kinda blocked on #8377 to refactor the scraper in a way to make controlling it's lifetime more predictable.

github-actions[bot] commented 3 years ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

vagababov commented 3 years ago

/lifecycle frozen

Probably still want this

evankanderson commented 3 years ago

/triage accepted

nader-ziada commented 2 years ago

/assign

dprotaso commented 2 years ago

Related PR: https://github.com/knative/serving/pull/13027

If there are no ready activators but ready pod endpoints requests will bypass the activator even if TBC=-1

dprotaso commented 7 months ago

/unassign @nader-ziada

yenniechen commented 2 months ago

Kindly to ask if this issue still open?If is, I’m glad to try.

@dprotaso @markusthoemmes

yenniechen commented 2 months ago

/assign