Closed aLekSer closed 4 years ago
We can switch from the getMonitoredResource() function to monitoredresource.Autodetect() after updating to OpenCensus 0.22 and contrib.go.opencensus.io/exporter/stackdriver v0.12.0, as is done in the latest example here:
https://github.com/census-ecosystem/opencensus-go-exporter-stackdriver/blob/6ee7f9652d2a9e707fea22c56d06235db6289426/examples/stats/main.go#L51
@bbf Have you seen this on recent releases?
While I have not tested any recent releases, I can imagine why a few things stopped working. I'm very interested in overhauling the Stackdriver integration of Agones, so if possible give me some time to look into it.
It was already in my plans to propose some changes to have a better alignment between Agones and the new monitoring agent used by GKE on Stackdriver, so addressing that while fixing this bug might be ideal.
@markmandel / @aLekSer WDYT?
Hello, I managed to get it working yesterday by updating the exporter's monitored resource. I will send a PR; it involves updating OpenCensus to 0.22, which slowed me down a bit. @bbf I will send a draft PR soon so you can review.
Well, switching to Autodetect() did not work on recent OpenCensus and stackdriver-exporter either:
https://github.com/census-ecosystem/opencensus-go-exporter-stackdriver/blob/master/monitoredresource/gcp_metadata_config.go#L100
I will rewrite getMonitoredResource() for a fast fix.
Then I need to understand why Autodetect():
resT, lab := monitoredresource.Autodetect().MonitoredResource()
logger.Info("Monitored Resource: ", resT, " ", lab)
returns the following on the GKE test-cluster:
Monitored Resource: gke_container map[cluster_name:test-cluster container_name:agones-controller instance_id:1205178163407041488 namespace_id: pod_id:agones-controller-59bd95c448-dwp88 project_id:agones-alexander zone:us-west1-c]
whereas the working resource type is k8s_container, as in the upcoming PR.
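To illustrate the difference between the two resource types, here is a minimal sketch of how the gke_container labels printed above could be carried over to a k8s_container resource. It is not the code from the PR: `resource` and `toK8sContainer` are hypothetical names, the local type only mirrors the single-method shape of the exporter's monitoredresource.Interface, and the namespace is passed in explicitly (the value `agones-system` is an assumption) because namespace_id came back empty in the output above.

```go
package main

import "fmt"

// resource is a hypothetical type with the same method as the exporter's
// monitoredresource.Interface: it returns a resource type and its labels.
type resource struct {
	resType string
	labels  map[string]string
}

func (r resource) MonitoredResource() (string, map[string]string) {
	return r.resType, r.labels
}

// toK8sContainer (hypothetical helper) renames gke_container labels to the
// label names used by the k8s_container resource type. The namespace is a
// parameter because Autodetect() returned an empty namespace_id.
func toK8sContainer(gke map[string]string, namespace string) resource {
	return resource{
		resType: "k8s_container",
		labels: map[string]string{
			"project_id":     gke["project_id"],
			"location":       gke["zone"],
			"cluster_name":   gke["cluster_name"],
			"namespace_name": namespace,
			"pod_name":       gke["pod_id"],
			"container_name": gke["container_name"],
		},
	}
}

func main() {
	// Labels copied from the Autodetect() output above.
	gke := map[string]string{
		"project_id":     "agones-alexander",
		"zone":           "us-west1-c",
		"cluster_name":   "test-cluster",
		"namespace_id":   "",
		"pod_id":         "agones-controller-59bd95c448-dwp88",
		"container_name": "agones-controller",
	}
	// "agones-system" is an assumed namespace, used for illustration only.
	resType, labels := toK8sContainer(gke, "agones-system").MonitoredResource()
	fmt.Println("Monitored Resource:", resType, labels)
}
```

A value of this shape could then be assigned to the MonitoredResource field of the exporter's stackdriver.Options, provided it satisfies the real interface.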
We also receive errors from the Prometheus exporter:
textPayload: "2020/02/07 15:14:14 Failed to export to Prometheus: inconsistent label cardinality: expected 1 label values but got 0 in []string(nil)
This seems to be https://github.com/census-instrumentation/opencensus-go/issues/659, which has a fix in https://github.com/census-instrumentation/opencensus-go/pull/989
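For context, the message says that a metric declared with one label name was exported with zero label values. The sketch below only reproduces that condition and one possible workaround (padding missing values with empty strings). It is an illustration under my own assumptions, not the code from the linked PR; `checkCardinality` and `padValues` are hypothetical helpers.

```go
package main

import "fmt"

// checkCardinality (hypothetical) returns an error when the number of label
// values differs from the number of declared label names.
func checkCardinality(names, values []string) error {
	if len(names) != len(values) {
		return fmt.Errorf("inconsistent label cardinality: expected %d label values but got %d in %#v",
			len(names), len(values), values)
	}
	return nil
}

// padValues (hypothetical) returns a slice with exactly one value per label
// name, leaving an empty string where no value was supplied.
func padValues(names, values []string) []string {
	out := make([]string, len(names))
	copy(out, values)
	return out
}

func main() {
	names := []string{"fleet_name"} // assumed label name, for illustration
	var values []string             // no values were recorded

	fmt.Println(checkCardinality(names, values))
	fmt.Println(checkCardinality(names, padValues(names, values)))
}
```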
I defer these things to you two :smile: my knowledge of metrics is very low.
I definitely advocate for a working solution :grin:
@cyriltovena have you got any feedback here?
Currently on master, Prometheus is working, but the Agones Controller logs contain this error message. PR #1335 adds working Stackdriver support. The update to OpenCensus 0.22 could be done after this fix, to split up the process. I had planned to update in a single PR, but as in #893, all tests would have to be updated.
Is this fixed now?
Stackdriver will be fixed after the PR. For now I am grabbing screenshots from Grafana to compare with the previous ones made by @cyriltovena as part of #1479.
There are no stackdriver metrics due to error in Labels:
What happened:
No Stackdriver metrics on the dashboard, which was working several months ago. New errors in the Agones Controller logs.
What you expected to happen: Stackdriver metrics are correctly visualised.
How to reproduce it (as minimally and precisely as possible): https://agones.dev/site/docs/guides/metrics/#stackdriver-installation
What should be done to fix the issue:
Use the Kubernetes Container resource instead of Global for each gameserver, and upload a new screenshot from Stackdriver.
view.Distribution with zero range value.
Anything else we need to know?: There are two pull requests which resolved the ticket mentioned above. They contain fixes for: