canonical / grafana-k8s-operator

https://charmhub.io/grafana-k8s
Apache License 2.0

Fetch-lib. Fix for #315 #327

Closed: sed-i closed 5 months ago

sed-i commented 5 months ago

#315 had an issue with tls_certificates v3. Checking here so noctua doesn't force-push over my commit.

In this PR:

sed-i commented 5 months ago

Tests passed locally, but grafana had a brief error:

unit-grafana-1: 13:58:18 ERROR unit.grafana/1.juju-log certificates:2: Could not restart grafana at this time: cannot perform the following tasks:
- Start service "grafana" (cannot start service: exited quickly with code 1)
----- Logs from task 0 -----
2024-04-25T17:58:18Z INFO Most recent service output:
    (...)
    logger=provisioning.plugins t=2024-04-25T17:58:18.516812528Z level=error msg="Failed to read plugin provisioning files from directory" path=/etc/grafana/provisioning/plugins error="open /etc/grafana/provisioning/plugins: no such file or directory"
    logger=provisioning.notifiers t=2024-04-25T17:58:18.516834278Z level=error msg="Can't read alert notification provisioning files from directory" path=/etc/grafana/provisioning/notifiers error="open /etc/grafana/provisioning/notifiers: no such file or directory"
    logger=provisioning.alerting t=2024-04-25T17:58:18.516849248Z level=error msg="can't read alerting provisioning files from directory" path=/etc/grafana/provisioning/alerting error="open /etc/grafana/provisioning/alerting: no such file or directory"
    logger=provisioning.alerting t=2024-04-25T17:58:18.516855357Z level=info msg="starting to provision alerting"
    logger=provisioning.alerting t=2024-04-25T17:58:18.516862268Z level=info msg="finished to provision alerting"
    logger=modules t=2024-04-25T17:58:18.516971128Z level=warn msg="No modules registered..."
    logger=ngalert.state.manager t=2024-04-25T17:58:18.516974078Z level=info msg="Warming state cache for startup"
    logger=grafanaStorageLogger t=2024-04-25T17:58:18.517046737Z level=info msg="storage starting"
    logger=ngalert.state.manager t=2024-04-25T17:58:18.517146777Z level=info msg="State cache has been initialized" states=0 duration=172.249µs
    logger=ngalert.multiorg.alertmanager t=2024-04-25T17:58:18.517188197Z level=info msg="starting MultiOrg Alertmanager"
    logger=ticker t=2024-04-25T17:58:18.517197237Z level=info msg=starting first_tick=2024-04-25T17:58:20Z
    logger=server t=2024-04-25T17:58:18.518468975Z level=error msg="Stopped background service" service=*api.HTTPServer reason="cert_file cannot be empty when using HTTPS"
    logger=tracing t=2024-04-25T17:58:18.518571835Z level=info msg="Closing tracing"
    logger=ticker t=2024-04-25T17:58:18.518727614Z level=info msg=stopped last_tick=2024-04-25T17:58:10Z
    logger=grafana.update.checker t=2024-04-25T17:58:18.518759164Z level=error msg="Update check failed" error="failed to get latest.json repo from github.com: Get \"https://raw.githubusercontent.com/grafana/grafana/main/latest.json\": context canceled" duration=1.539637ms
    logger=auth t=2024-04-25T17:58:18.521339439Z level=error msg="failed to lock and execute cleanup of expired auth token" error="context canceled"
    logger=serviceaccounts t=2024-04-25T17:58:18.522056807Z level=warn msg="Failed to get usage metrics" error="context canceled"
    logger=infra.usagestats t=2024-04-25T17:58:18.522125307Z level=error msg="Failed to get last sent time" error="context canceled"
    logger=infra.usagestats.collector t=2024-04-25T17:58:18.522235617Z level=error msg="Failed to get system stats" error="context canceled"
    Error: ✗ *api.HTTPServer run error: cert_file cannot be empty when using HTTPS
2024-04-25T17:58:18Z ERROR cannot start service: exited quickly with code 1

Specifically:

It seems the first two are non-critical errors; the empty cert_file is what brings Grafana down.
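One way to avoid this kind of failure is to turn on HTTPS only once both the certificate and the key are actually available. The sketch below is illustrative only and is not the charm's code: `server_settings` is a hypothetical helper, and it assumes the charm renders Grafana's `[server]` section from a dict of the `protocol`, `cert_file` and `cert_key` options.

```python
from typing import Dict, Optional


def server_settings(cert_path: Optional[str], key_path: Optional[str]) -> Dict[str, str]:
    """Hypothetical helper: build Grafana [server] options.

    HTTPS is enabled only when both paths are non-empty, so Grafana is
    never started with protocol=https and an empty cert_file.
    """
    if cert_path and key_path:
        return {
            "protocol": "https",
            "cert_file": cert_path,
            "cert_key": key_path,
        }
    # No certificate yet (for example, the certificates relation exists but
    # the cert has not been issued): fall back to plain HTTP for now.
    return {"protocol": "http"}


if __name__ == "__main__":
    # Relation created, cert not yet received: stay on HTTP.
    print(server_settings(None, None))
    # Cert and key written to disk (paths are placeholders): switch to HTTPS.
    print(server_settings("/path/to/cert.pem", "/path/to/key.pem"))
```

With a guard like this, a restart triggered early in a certificates hook would bring Grafana up over HTTP, and a later hook would switch it to HTTPS once the files exist.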

Context:

25 Apr 2024 13:56:42-04:00  juju-unit  allocating   
25 Apr 2024 13:56:42-04:00  workload   waiting      installing agent
25 Apr 2024 13:56:54-04:00  workload   waiting      agent initialising
25 Apr 2024 13:57:28-04:00  workload   maintenance  installing charm software
25 Apr 2024 13:57:28-04:00  juju-unit  executing    running install hook
25 Apr 2024 13:57:30-04:00  juju-unit  executing    running grafana-relation-created hook
25 Apr 2024 13:57:30-04:00  juju-unit  executing    running certificates-relation-created hook
25 Apr 2024 13:57:31-04:00  juju-unit  executing    running replicas-relation-created hook
25 Apr 2024 13:57:32-04:00  juju-unit  executing    running leader-elected hook
25 Apr 2024 13:57:33-04:00  juju-unit  executing    running grafana-pebble-ready hook
25 Apr 2024 13:57:37-04:00  juju-unit  executing    running litestream-pebble-ready hook
25 Apr 2024 13:57:38-04:00  juju-unit  executing    running database-storage-attached hook
25 Apr 2024 13:58:04-04:00  juju-unit  executing    running config-changed hook
25 Apr 2024 13:58:08-04:00  juju-unit  executing    running start hook
25 Apr 2024 13:58:10-04:00  workload   unknown      
25 Apr 2024 13:58:10-04:00  juju-unit  executing    running litestream-pebble-ready hook
25 Apr 2024 13:58:10-04:00  juju-unit  executing    running grafana-pebble-ready hook

unit-grafana-1: 13:58:12 INFO unit.grafana/1.juju-log Restarted grafana-k8s
unit-grafana-1: 13:58:13 INFO unit.grafana/1.juju-log Restarted grafana-k8s
unit-grafana-1: 13:58:14 INFO unit.grafana/1.juju-log Initializing dashboard provisioning path
unit-grafana-1: 13:58:15 INFO unit.grafana/1.juju-log Restarted grafana-k8s

25 Apr 2024 13:58:15-04:00  juju-unit  executing    running replicas-relation-joined hook for grafana/0

(BOOM)

unit-grafana-1: 13:58:18 ERROR unit.grafana/1.juju-log certificates:2: Could not restart grafana at this time: cannot perform the following tasks:
- Start service "grafana" (cannot start service: exited quickly with code 1)
...

25 Apr 2024 13:58:16-04:00  juju-unit  executing    running certificates-relation-changed hook
25 Apr 2024 13:58:19-04:00  juju-unit  executing    running certificates-relation-joined hook for ca/0
25 Apr 2024 13:58:19-04:00  juju-unit  idle         
25 Apr 2024 13:58:19-04:00  juju-unit  executing    running replicas-relation-changed hook for grafana/0
25 Apr 2024 13:58:20-04:00  juju-unit  executing    running certificates-relation-changed hook for ca/0
25 Apr 2024 13:58:23-04:00  juju-unit  idle         
25 Apr 2024 13:58:24-04:00  juju-unit  executing    running grafana-relation-changed hook
25 Apr 2024 13:58:25-04:00  juju-unit  idle         
25 Apr 2024 13:58:25-04:00  juju-unit  executing    running grafana-relation-joined hook for grafana/0
25 Apr 2024 13:58:26-04:00  juju-unit  executing    running grafana-relation-changed hook for grafana/0
25 Apr 2024 13:58:58-04:00  juju-unit  idle         
25 Apr 2024 13:58:59-04:00  juju-unit  executing    running replicas-relation-changed hook for grafana/0
25 Apr 2024 13:59:00-04:00  juju-unit  executing    running certificates-relation-changed hook
25 Apr 2024 13:59:10-04:00  juju-unit  idle         
25 Apr 2024 13:59:10-04:00  workload   active       
25 Apr 2024 13:59:16-04:00  juju-unit  executing    running replicas-relation-changed hook for grafana/0
25 Apr 2024 13:59:17-04:00  juju-unit  idle         
25 Apr 2024 14:00:25-04:00  workload   maintenance  stopping charm software
25 Apr 2024 14:00:25-04:00  juju-unit  executing    running stop hook
25 Apr 2024 14:00:26-04:00  workload   maintenance  Application is terminating.
25 Apr 2024 14:00:26-04:00  workload   maintenance  
25 Apr 2024 14:01:03-04:00  juju-unit  executing    running upgrade-charm hook
25 Apr 2024 14:01:15-04:00  juju-unit  executing    running config-changed hook
25 Apr 2024 14:01:20-04:00  juju-unit  executing    running start hook
25 Apr 2024 14:01:22-04:00  juju-unit  executing    running grafana-pebble-ready hook
25 Apr 2024 14:01:31-04:00  juju-unit  idle         
25 Apr 2024 14:01:31-04:00  juju-unit  executing    running litestream-pebble-ready hook
25 Apr 2024 14:01:32-04:00  juju-unit  executing    running certificates-relation-changed hook
25 Apr 2024 14:01:35-04:00  juju-unit  idle         
25 Apr 2024 14:01:36-04:00  juju-unit  executing    running grafana-relation-changed hook
25 Apr 2024 14:01:37-04:00  juju-unit  idle         
25 Apr 2024 14:02:07-04:00  juju-unit  executing    running replicas-relation-changed hook for grafana/0
25 Apr 2024 14:02:07-04:00  juju-unit  executing    running certificates-relation-changed hook
25 Apr 2024 14:02:20-04:00  juju-unit  idle         
25 Apr 2024 14:02:20-04:00  workload   active       
25 Apr 2024 14:02:23-04:00  juju-unit  executing    running replicas-relation-changed hook for grafana/0
25 Apr 2024 14:02:24-04:00  juju-unit  idle