kubernetes-retired / heapster

[EOL] Compute Resource Usage Analysis and Monitoring of Container Clusters
Apache License 2.0

Allow users to manually configure the Stackdriver sink to restore metric ingestion and stop the crash loop when not running in GCP. #2052

Closed bmoyles0117 closed 6 years ago

bmoyles0117 commented 6 years ago

What this PR does

When Heapster is configured to use Stackdriver as its only sink and the Kubernetes cluster is not running inside GCP, Heapster crashes repeatedly and no metrics are sent to Stackdriver. Metrics ingestion is therefore broken for Stackdriver monitoring customers with this configuration.

This PR allows users to manually configure the Stackdriver sink to restore metric ingestion and stop the crash loop.
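As a rough illustration of the idea (not the code in this PR), the sketch below prefers values supplied in the sink URI and only falls back to a metadata lookup for values the user left out. A missing metadata server then produces a returned error rather than a crash. The option names `project_id`, `cluster_name`, and `zone`, the `stackdriver:?...` URI form, and all helper names are hypothetical and chosen only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// sinkConfig holds the values the sink would need to label metrics.
// The field set is an assumption made for this sketch.
type sinkConfig struct {
	ProjectID   string
	ClusterName string
	Zone        string
}

// metadataLookup stands in for a query to the GCE Metadata Server (hypothetical).
type metadataLookup func(key string) (string, error)

// unavailableMetadata simulates running outside GCP: every lookup fails.
func unavailableMetadata(key string) (string, error) {
	return "", errors.New("metadata server unreachable")
}

// resolveConfig takes each value from the sink URI when present and consults
// the metadata lookup only for keys the user did not set. Failures are
// returned as errors so the caller can decide how to react.
func resolveConfig(sinkURI string, lookup metadataLookup) (sinkConfig, error) {
	u, err := url.Parse(sinkURI)
	if err != nil {
		return sinkConfig{}, fmt.Errorf("parsing sink URI: %w", err)
	}
	opts := u.Query()

	get := func(key string) (string, error) {
		if v := opts.Get(key); v != "" {
			return v, nil // manual configuration wins
		}
		v, err := lookup(key)
		if err != nil {
			return "", fmt.Errorf("%s not set and metadata lookup failed: %w", key, err)
		}
		return v, nil
	}

	var cfg sinkConfig
	if cfg.ProjectID, err = get("project_id"); err != nil {
		return sinkConfig{}, err
	}
	if cfg.ClusterName, err = get("cluster_name"); err != nil {
		return sinkConfig{}, err
	}
	if cfg.Zone, err = get("zone"); err != nil {
		return sinkConfig{}, err
	}
	return cfg, nil
}

func main() {
	// Fully specified by hand: works even though metadata is unavailable.
	cfg, err := resolveConfig(
		"stackdriver:?project_id=my-project&cluster_name=my-cluster&zone=us-east1-b",
		unavailableMetadata,
	)
	fmt.Println(cfg, err)

	// Nothing specified and no metadata server: an error, not a panic.
	_, err = resolveConfig("stackdriver:", unavailableMetadata)
	fmt.Println(err)
}
```

Passing the lookup in as a function keeps the fallback order easy to test without a real metadata server.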

DirectXMan12 commented 6 years ago

Heapster is now deprecated, so no new features will be accepted. Please see https://github.com/kubernetes/heapster/blob/master/docs/deprecation.md for more information.

bmoyles0117 commented 6 years ago

@DirectXMan12 this change is a stability fix for the Stackdriver sink. When we run Heapster in any environment that is not GCE, Heapster crashes. This change lets us mitigate that crash for our users when the GCE Metadata Server is unavailable. Is there any way I can get your help in approving this change and validating the case? I really appreciate any input you have on this, and any alternatives you can offer to help us release this critical stability fix to our users.

bmoyles0117 commented 6 years ago

Just to clarify: by "our users", I'm being a bit selfish and mean "Stackdriver users", or "users that have enabled the Stackdriver sink". I didn't mean to imply that this somehow affects Heapster users who are not using this sink.

kawych commented 6 years ago

/ok-to-test

LGTM once comments from Marian are resolved.

k8s-ci-robot commented 6 years ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bmoyles0117, loburm

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[metrics/sinks/stackdriver/OWNERS](https://github.com/kubernetes/heapster/blob/master/metrics/sinks/stackdriver/OWNERS)~~ [loburm]

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.