tomassommareqt opened 1 year ago
Thanks @tomassommareqt. FWIW I have seen the same logs when working on this feature. I don't expect these logs to show up outside of a dev context, though. We'll investigate and fix this.
Still seen in 2.9.0, although the metrics work when using the Metric Writer role:
2024/03/15 23:10:50 Failed to export to Stackdriver: rpc error: code = PermissionDenied desc = The caller does not have permission
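For anyone hitting the PermissionDenied variant above, here is a minimal sketch of granting that role. The service account and project names are placeholders, not values from this report:

```sh
# Hypothetical example: grant the Monitoring Metric Writer role to the
# service account the proxy runs as, in the project passed to --telemetry-project.
gcloud projects add-iam-policy-binding <telemetry-project> \
  --member="serviceAccount:<proxy-sa>@<project>.iam.gserviceaccount.com" \
  --role="roles/monitoring.metricWriter"
```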
Thanks, @rojomisin. We still haven't gotten to this. I wonder if this is a race condition in OpenCensus itself.
Perhaps fixed in the OpenTelemetry package? https://github.com/open-telemetry/opentelemetry-go-contrib
Quite possibly. We're currently using OpenCensus given that some internal tooling that uses the Proxy has a big investment in OpenCensus. But we might revisit that decision now that OpenTelemetry's metrics package is stable.
We will be migrating to OpenTelemetry in the somewhat near future, which will hopefully resolve this issue...
Bug Description
We are running gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.7.0 as a container next to our main HTTP API container for connectivity to our Cloud SQL instance.
After enabling telemetry using the --telemetry-project and --telemetry-prefix flags, we have recurrently gotten the following error logged:

2023/11/04 13:58:43 Failed to export to Stackdriver: rpc error: code = Internal desc = One or more TimeSeries could not be written: Internal error encountered. Please retry after a few seconds. If internal errors persist, contact support at https://cloud.google.com/support/docs.: global{} timeSeries[0]: custom.googleapis.com/opencensus/<redacted>_cloud_sql_proxy/cloudsqlconn/refresh_success_count{opencensus_task:go-1@<redacted>,cloudsql_instance:<redacted>}; Internal error encountered. Please retry after a few seconds. If internal errors persist, contact support at https://cloud.google.com/support/docs.: global{} timeSeries[1]: custom.googleapis.com/opencensus/<redacted>_cloud_sql_proxy/cloudsqlconn/dial_latency{cloudsql_instance:<redacted>,opencensus_task:go-1@<redacted>}
However, when inspecting the metrics we can see that the export works as expected, so this mostly causes polluted logs. But it would also be interesting to understand why this error is reported.
Example code (or command)
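A minimal sketch of the sidecar invocation, assuming the v2 container entrypoint; the instance connection name, project, and prefix values are placeholders:

```sh
# Sketch only: the flags match the report above; all bracketed values are placeholders.
/cloud-sql-proxy \
  --telemetry-project=<telemetry-project> \
  --telemetry-prefix=<prefix> \
  <project>:<region>:<instance>
```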
Stacktrace
Steps to reproduce?
Environment
Cloud SQL Proxy version: 2.7.0 (gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.7.0), running as a sidecar container.
Additional Details
No response