grafana / grafana

The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.
https://grafana.com
GNU Affero General Public License v3.0
64.67k stars 12.1k forks

Announcement banner: Database migration errors shown on homepage #95634

Open bcrisp4 opened 1 day ago

bcrisp4 commented 1 day ago

What happened?

I upgraded Grafana to 11.3.0. Since then, I get error messages on the homepage and an unhandled error in the logs: Image

This appears to be triggered by requests to the new announcement-banners API: Image

I suspect a database migration related to this new feature is not respecting my TLS configuration so the server refuses the connection.

What did you expect to happen?

No errors.

Did this work before?

N/A

How do we reproduce it?

In my environment I use a PostgreSQL database with TLS. Config:

[database]
ca_cert_path = /etc/secrets/database/tls/ca.crt
client_cert_path = /etc/secrets/database/tls/tls.crt
client_key_path = /etc/secrets/database/tls/tls.key
host = $__env{DATABASE_HOST}
name = $__env{DATABASE_NAME}
password = $__env{DATABASE_PASSWORD}
server_cert_name = grafana
ssl_mode = verify-full
type = postgres
user = $__env{DATABASE_USER}

1. Run Grafana 11.3.0 with the notificationBanner feature toggle on
2. Load the homepage

Disabling the notificationBanner feature toggle seems to suppress the messages:

[feature_toggles]
notificationBanner = false

Is the bug inside a dashboard panel?

No.

Environment (with versions)?

Grafana: 11.3.0
OS: Ubuntu 20.04
Browser: Chrome 130

Grafana platform?

Kubernetes

Datasource(s)?

No response

inf0rmer commented 1 day ago

We’ve confirmed the root cause of this issue. It is caused by mistakenly trying to read [database][sslmode] from grafana.ini instead of reading [database][ssl_mode]. We’re going to issue a PR (update: here is the PR) to fix this ASAP and it will land in the next patch release.

There is a workaround: users can add a new config named [database][sslmode] and use whatever they have in [database][ssl_mode] as the value. Both config keys should be kept until the patch with the fix is out, at which point [database][sslmode] can be removed.
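
To illustrate why a one-character key mismatch behaves this way, and why the workaround helps, here is a minimal Python sketch. It is not Grafana's actual code (Grafana is written in Go); it uses the standard-library `configparser`, and the function name `read_ssl_mode` and the fallback value `disable` are assumptions made purely for illustration.

```python
import configparser


def read_ssl_mode(ini_text: str, key: str, fallback: str = "disable") -> str:
    """Return [database][key] from ini_text, or the fallback if the key is absent.

    The fallback value "disable" is an assumption for this sketch, not a
    statement about Grafana's real default.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser.get("database", key, fallback=fallback)


# Configuration that only sets the documented key.
ORIGINAL = """
[database]
type = postgres
ssl_mode = verify-full
"""

# Workaround: the same configuration with the misspelled key added as well.
WORKAROUND = ORIGINAL + "sslmode = verify-full\n"

# Reading the documented key finds the configured value.
documented = read_ssl_mode(ORIGINAL, "ssl_mode")
# Reading the misspelled key finds nothing and silently uses the fallback.
misread = read_ssl_mode(ORIGINAL, "sslmode")
# With the workaround in place, the misspelled key is present too.
patched = read_ssl_mode(WORKAROUND, "sslmode")
```

In this sketch the lookup of the wrong key does not raise an error; it quietly returns the fallback, which is why a misnamed key can go unnoticed until the server rejects the connection.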

bcrisp4 commented 1 day ago

Thanks for looking into this.

I have tried the workaround. It works but I now hit a different but similar error: Image

Config:

[database]
ca_cert_path = /etc/secrets/database/tls/ca.crt
client_cert_path = /etc/secrets/database/tls/tls.crt
client_key_path = /etc/secrets/database/tls/tls.key
host = $__env{DATABASE_HOST}
name = $__env{DATABASE_NAME}
password = $__env{DATABASE_PASSWORD}
server_cert_name = grafana
ssl_mode = verify-full
sslmode = verify-full
type = postgres
user = $__env{DATABASE_USER}

I assume it's the same issue (missing or misnamed config parameters), this time affecting the certificate-related parameters. This particular error concerns ca_cert_path, but I would guess client_cert_path, client_key_path and server_cert_name may also be broken.