GoogleCloudPlatform / k8s-stackdriver

Apache License 2.0

pod autoscaling based on Stackdriver predefined metrics. #78

Open dzolotusky opened 6 years ago

dzolotusky commented 6 years ago

I'd like to use k8s-stackdriver to do pod autoscaling based on metrics that already exist in Stackdriver's predefined metrics list. For example, using Cloud Pub/Sub's subscription/num_undelivered_messages as the metric to autoscale on.

Is this possible with what exists today or what is planned for this project, or is this only intended for custom metrics?

x13n commented 6 years ago

I think this will use custom metrics, but CCing folks that should know better.

@kawych @MaciekPytel

dzolotusky commented 6 years ago

Thanks. I'd like to understand if there is a plan for supporting predefined metrics. If there is, I'd like to contribute to making it happen. If not, I'd like to work on one.

MaciekPytel commented 6 years ago

Hi @dzolotusky, I'm looking into supporting predefined metrics right now. It will likely require significant work on the Kubernetes side, so I'm trying to get it in for 1.10. I can link a proposal PR here once it's ready for review (late next week or so, I think).

In 1.9 you can hack around this to get it working, but it won't be pretty. Basically, you can already scale based on a global custom metric coming from a pod in k8s ("object metric" in https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/). So you could create a pod that re-exports the predefined metric and scale based on that. The problems with this approach (other than the obvious ugliness) are:

  1. You need a bit of custom work to re-export the metric. https://github.com/GoogleCloudPlatform/k8s-stackdriver/tree/master/custom-metrics-stackdriver-adapter/examples/direct-to-sd is probably a good starting point.
  2. You can only define a targetValue for an object metric, not an average target per pod. So you can say your target is to have 10 undelivered messages in Pub/Sub, and if you have 20, HPA will double your number of workers. You can't say 'I want one worker for every 20 undelivered messages'.
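
To make the difference in point 2 concrete, here is a minimal sketch of the two scaling rules in Python. It assumes the basic HPA formula (desired replicas = ceil(current replicas × current value / target value)) and ignores the tolerance window and min/max replica bounds that the real controller applies. The function names are hypothetical, chosen only for illustration.

```python
import math


def replicas_for_object_target(current_replicas: int,
                               current_value: float,
                               target_value: float) -> int:
    """Scale against one global target value (the object-metric case).

    The result depends on how many replicas are running now: the
    controller multiplies the current count by the ratio of the
    observed value to the target.
    """
    return math.ceil(current_replicas * current_value / target_value)


def replicas_for_average_target(total_value: float,
                                target_per_pod: float) -> int:
    """Scale so that each pod handles at most `target_per_pod` units.

    This is the 'one worker per N messages' rule, which is independent
    of the current replica count.
    """
    return math.ceil(total_value / target_per_pod)


if __name__ == "__main__":
    # Target of 10 undelivered messages, 20 observed: worker count doubles.
    print(replicas_for_object_target(3, 20, 10))
    # One worker per 20 undelivered messages, 100 observed.
    print(replicas_for_average_target(100, 20))
```

With a global target, the same backlog of 20 messages yields 2 replicas when starting from 1 worker and 6 when starting from 3, which is why it cannot express a fixed number of messages per worker.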

Feel free to post here or just ping me on Slack if you want to discuss this in more detail (I'm in the CET timezone).

dzolotusky commented 6 years ago

@MaciekPytel Thanks, I'll wait for the proposal PR before pushing to discuss this further as the proposal will likely answer many of my initial questions. If you need any help as you're crafting the proposal, feel free to ping me on Slack. (I'm also in the CET timezone)

dzolotusky commented 6 years ago

@MaciekPytel any update on this? Anything that I can do to help?

nareshgnT commented 6 years ago

1.10 is out. Has this been released? If so, please point to any documentation on it.

arunk2 commented 6 years ago

1.10 is out. You can scale based on any metric available in Stackdriver. A detailed tutorial on Pub/Sub-based scaling can be found at the link below:

https://cloud.google.com/kubernetes-engine/docs/tutorials/external-metrics-autoscaling
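
As a rough illustration of what such a setup might look like, the sketch below builds an HPA manifest with an External metric and prints it as JSON, which kubectl also accepts. It assumes the `autoscaling/v2beta1` API from the 1.10 era (since replaced by `autoscaling/v2`, which uses a different field layout). The metric name and label key follow the pattern for Stackdriver metrics but should be checked against the tutorial, and the resource names (`pubsub-hpa`, `pubsub-worker`, `my-subscription`) and the helper function are hypothetical.

```python
import json


def external_metric_hpa(name: str, deployment: str, subscription_id: str,
                        target_per_pod: int, min_replicas: int = 1,
                        max_replicas: int = 5) -> dict:
    """Build an HPA manifest (autoscaling/v2beta1) using an External metric."""
    return {
        "apiVersion": "autoscaling/v2beta1",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "metrics": [{
                "type": "External",
                "external": {
                    # Assumed naming: Stackdriver metric path with '|' separators.
                    "metricName": "pubsub.googleapis.com|subscription|num_undelivered_messages",
                    "metricSelector": {
                        "matchLabels": {
                            # Assumed label key identifying the subscription.
                            "resource.labels.subscription_id": subscription_id,
                        }
                    },
                    # Average target per pod, expressed as a quantity string.
                    "targetAverageValue": str(target_per_pod),
                },
            }],
        },
    }


if __name__ == "__main__":
    manifest = external_metric_hpa("pubsub-hpa", "pubsub-worker",
                                   "my-subscription", target_per_pod=20)
    print(json.dumps(manifest, indent=2))
```

Using `targetAverageValue` rather than `targetValue` gives the "one worker per N undelivered messages" behavior that was not expressible with object metrics in 1.9.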