Grafana Database is locked suddenly

tonypowa commented 3 days ago

What happened?

The Grafana service went down suddenly and now returns the message: failed to check table existence: database is locked

What did you expect to happen?

The Grafana service should resume.

Did this work before?

I am using Grafana OSS v10.3.1

How do we reproduce it?

Delete the Grafana deployment from K8s.

Create Grafana again based on the YAML.

Grafana returns the following error:

logger=migrator t=2024-05-24T06:16:58.463867446Z level=info msg="Starting DB migrations"

Error: ✗ failed to check table existence: database is locked

Is the bug inside a dashboard panel?

No response

Environment (with versions)?

Grafana: 10.3.1

OS: K8s

Browser: Chrome

Grafana platform?

Kubernetes

Datasource(s)?

No response

tonypowa commented 3 days ago

This issue is a test copy of an issue in another repo. Original issue: https://github.com/grafana/grafana/issues/#88263

moxious commented 3 days ago

Hi @tonypowa, it seems like you're encountering a database-related issue with Grafana. This could be due to various reasons such as concurrent writes or long-running transactions. To better assist you, could you please provide additional details or logs if available? Also, ensure that you're not using SQLite in a multi-threaded environment, as it might not handle concurrent access well. I will direct this issue to the Grafana Backend platform project, as it's related to Grafana's core service operation. The team responsible for backend issues will be better equipped to help you diagnose and resolve this problem.
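For illustration, here is a minimal sketch of how a "database is locked" error can arise when two connections try to write to the same SQLite file. It assumes the deployment uses an embedded SQLite database, which has not been confirmed in this thread. The function name, table, and file path are hypothetical, and the sketch does not claim to reproduce what Grafana's migrator does internally.

```python
import os
import sqlite3
import tempfile


def try_concurrent_write(db_path: str) -> str:
    """Return the error text a second writer sees while a first holds the write lock."""
    # isolation_level=None puts the connection in autocommit mode, so
    # transactions are controlled explicitly with BEGIN / ROLLBACK.
    holder = sqlite3.connect(db_path, isolation_level=None)
    holder.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
    holder.execute("BEGIN IMMEDIATE")  # take the write lock and keep it
    holder.execute("INSERT INTO t VALUES (1)")

    # timeout=0 makes the second connection fail at once instead of waiting.
    blocked = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    try:
        blocked.execute("BEGIN IMMEDIATE")  # second writer asks for the same lock
        blocked.execute("ROLLBACK")
        return "no error"
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        blocked.close()
        holder.execute("ROLLBACK")
        holder.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(try_concurrent_write(os.path.join(tmp, "demo.db")))
```

If something similar is happening here, one possibility worth checking is whether the old pod was still holding the database file when the new pod started its migrations.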

moxious commented 3 days ago

Summary: An open issue with Grafana OSS v10.3.1 reports a 'database is locked' error leading to service downtime. The user expects the service to resume, and the problem occurs when deleting and recreating a Grafana deployment on Kubernetes.

moxious commented 3 days ago

Elaboration:

Hello @tonypowa, thank you for bringing this issue to our attention. To help us address the problem with Grafana more effectively, could you please provide a bit more information? Here's what would be helpful for us to proceed with troubleshooting your issue:

Detailed Reproduction Steps: While you've mentioned deleting the deployment and creating Grafana based on a YAML file, it would be helpful if you could provide the exact YAML file (with any sensitive information redacted). This will help us understand if there might be any configuration issues that are causing the lock.
Logs: Can you provide a fuller set of log entries from before and after the 'database is locked' error message? This will let us see whether any earlier warnings or actions could have led to this state.
Persistence Details: Are you using persistent volumes for your Grafana deployment? If so, please provide details about the storage backend and the PersistentVolumeClaim configuration.
Error Frequency: Does the issue happen every time you create the Grafana deployment, or was this a one-time error?
Resource Checks: Have you checked the resources on your Kubernetes nodes during deployment to ensure they're not being maxed out?
Kubernetes Version: Could you specify which version of Kubernetes you are running?
Grafana Configuration File: If you've made any custom configurations in the Grafana ini or other configuration files, please share those details.

By providing this additional information, we will be better equipped to help troubleshoot the issue with you. Thanks again for your cooperation!
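As a rough aid for gathering the details requested above, the following sketch prints a set of kubectl commands covering the Kubernetes version, the deployment and PersistentVolumeClaim definitions, recent logs, and node resource usage. The namespace "monitoring" and deployment name "grafana" are assumptions for illustration only, so substitute your own values. The script only prints the commands and does not run them.

```python
from typing import List


def diagnostic_commands(namespace: str, deployment: str) -> List[List[str]]:
    """Build kubectl commands that collect the details requested in this thread."""
    return [
        # Kubernetes client and server versions
        ["kubectl", "version"],
        # Deployment manifest as applied (redact secrets before sharing)
        ["kubectl", "get", "deployment", deployment, "-n", namespace, "-o", "yaml"],
        # PersistentVolumeClaim configuration in the namespace
        ["kubectl", "get", "pvc", "-n", namespace, "-o", "yaml"],
        # Recent log lines around the error
        ["kubectl", "logs", f"deployment/{deployment}", "-n", namespace, "--tail=200"],
        # Node resource usage (requires metrics-server in the cluster)
        ["kubectl", "top", "nodes"],
    ]


if __name__ == "__main__":
    # "monitoring" and "grafana" are placeholder values, not taken from the report.
    for cmd in diagnostic_commands("monitoring", "grafana"):
        print(" ".join(cmd))
```

Please review the output for sensitive values such as passwords or tokens before posting it here.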

moxious / triage