grafana / grafana

The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.
https://grafana.com
GNU Affero General Public License v3.0

Alerting: Manual Alert Rule Update not working #69848

Open blackswan1 opened 1 year ago

blackswan1 commented 1 year ago

What went wrong?

What happened:

After creating a new alert rule via the Provisioning API (Grafana v9.5.3), I'm no longer able to manually save the alert rule within the Grafana Web UI. The "X-Disable-Provenance" header is set while creating the alert rule.

Following error occurs in the UI:

Failed to save rule: failed to update rule group: request affects resources created via provisioning API: alert rule group [{orgID: 0, namespaceUID: xxx-xxxx, groupName: xxx}]

What did you expect to happen:

I expect that the alert rule can be saved successfully.

How do we reproduce it?

Step 1:

Step 2:

What Grafana version are you using?

Grafana: 9.5.3

Optional Questions:

Is the bug inside a Dashboard Panel?

Copy the panel's "get-help" data here

Grafana Platform?

A downloaded binary

User's OS?

RedHat

User's Browser?

Microsoft Edge Version 111.0.1661.41 (64-Bit)

Is this a Regression?

None

Are Datasources involved?

yes, Elasticsearch

Anything else to add?

No response

tonypowa commented 1 year ago

hi @blackswan1

thank you for creating this issue

can you provide the request in cURL or the JSON that you are POSTing?

Thank you

blackswan1 commented 1 year ago

hi @tonypowa

Below is the POST request. Some settings, such as the hostname and the Elasticsearch query, have been anonymized.

curl -X POST \
  https://HOSTNAME.XXX.YYY.net/api/v1/provisioning/alert-rules \
  -H 'Cache-Control: no-cache' \
  -H 'Content-Type: application/json' \
  -H 'Postman-Token: 1f25da43-998b-2225-2e4b-e8475191d2c5' \
  -d '{
    "orgID": 1,
    "folderUID": "ypF-o754z",
    "ruleGroup": "test",
    "title": "TESTING",
    "condition": "C",
    "data": [
        {
            "refId": "A",
            "queryType": "",
            "relativeTimeRange": {
                "from": 600,
                "to": 0
            },
            "datasourceUid": "zSCVTnc4k",
            "model": {
                "alias": "",
                "bucketAggs": [
                    {
                        "field": "@timestamp",
                        "id": "2",
                        "settings": {
                            "interval": "10m"
                        },
                        "type": "date_histogram"
                    }
                ],
                "hide": false,
                "intervalMs": 1000,
                "maxDataPoints": 43200,
                "metrics": [
                    {
                        "field": "status",
                        "id": "1",
                        "type": "max"
                    }
                ],
                "query": "_<query>_)",
                "refId": "A",
                "timeField": "@timestamp"
            }
        },
        {
            "refId": "B",
            "queryType": "",
            "relativeTimeRange": {
                "from": 600,
                "to": 0
            },
            "datasourceUid": "-100",
            "model": {
                "conditions": [
                    {
                        "evaluator": {
                            "params": [],
                            "type": "gt"
                        },
                        "operator": {
                            "type": "and"
                        },
                        "query": {
                            "params": [
                                "B"
                            ]
                        },
                        "reducer": {
                            "params": [],
                            "type": "last"
                        },
                        "type": "query"
                    }
                ],
                "datasource": {
                    "type": "__expr__",
                    "uid": "-100"
                },
                "expression": "A",
                "hide": false,
                "intervalMs": 1000,
                "maxDataPoints": 43200,
                "reducer": "last",
                "refId": "B",
                "settings": {
                    "mode": "replaceNN",
                    "replaceWithValue": 0
                },
                "type": "reduce"
            }
        },
        {
            "refId": "C",
            "queryType": "",
            "relativeTimeRange": {
                "from": 600,
                "to": 0
            },
            "datasourceUid": "-100",
            "model": {
                "conditions": [
                    {
                        "evaluator": {
                            "params": [
                                1
                            ],
                            "type": "gt"
                        },
                        "operator": {
                            "type": "and"
                        },
                        "query": {
                            "params": [
                                "C"
                            ]
                        },
                        "reducer": {
                            "params": [],
                            "type": "last"
                        },
                        "type": "query"
                    }
                ],
                "datasource": {
                    "type": "__expr__",
                    "uid": "-100"
                },
                "expression": "B",
                "hide": false,
                "intervalMs": 1000,
                "maxDataPoints": 43200,
                "refId": "C",
                "type": "threshold"
            }
        }
    ],
    "updated": "2023-06-09T11:06:35+02:00",
    "noDataState": "Alerting",
    "execErrState": "Error",
    "for": "10m",
    "labels": {
        "datasource": "_<datasource>_",
        "instance": "<instance>"
    },
    "isPaused": false
}'

tonypowa commented 1 year ago

Thanks @blackswan1

I have reproduced creating the uneditable alert

Sending this issue to the alerting squad for review

armandgrillet commented 1 year ago

The alerting squad has reproduced this. Once at least one alert rule is provisioned without "X-Disable-Provenance", the alert rule group becomes provisioned (we can see the label in the UI). Once this is the case, no alert rule in that group can be edited, not even the ones created via the UI or created with "X-Disable-Provenance" (provenance set to none). This is currently badly handled in the UI.
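
The group-level behaviour described in this comment can be modelled in a few lines. This is only a sketch of the rule as stated here, not Grafana's implementation. It assumes each rule is a dict with a `provenance` field that is empty for rules without provenance; both the field name and the empty-string convention are assumptions made for illustration.

```python
from typing import Iterable, Mapping


def group_is_provisioned(rules: Iterable[Mapping[str, str]]) -> bool:
    """Return True if any rule in the group carries a non-empty provenance."""
    return any(rule.get("provenance", "") != "" for rule in rules)


def editable_in_ui(rules: Iterable[Mapping[str, str]]) -> bool:
    """Per the comment above, a group is editable only if no rule is provisioned."""
    return not group_is_provisioned(rules)


if __name__ == "__main__":
    group = [
        {"title": "created in UI", "provenance": ""},
        {"title": "created via API without header", "provenance": "api"},
    ]
    # One provisioned rule is enough to lock the whole group.
    print(editable_in_ui(group))  # prints False
```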

The issue comes from the API layer, so we are not considering allowing edits of alert rules in a provisioned alert rule group. But there are many ways to fix this:

This is a bug and there are multiple ways to make the UX nicer; we will work on this in the coming weeks. Right now, the best you can do is to send every alert rule in the group with the "X-Disable-Provenance" header, so that no rule carries provenance and the alert rule group stops being marked as "provisioned", thus allowing edits of its alert rules.
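
As a rough illustration of that workaround, the sketch below builds (without sending) a create request that carries the header. The endpoint path is the one used in the cURL request earlier in this thread. The header value `true`, the bearer-token authentication, and the function name are assumptions. A later comment in this thread reports the value `disabled` working in 11.2.0.

```python
import json
import urllib.request


def build_create_rule_request(base_url: str, token: str, rule: dict) -> urllib.request.Request:
    """Build (but do not send) a POST that creates an alert rule without provenance."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/provisioning/alert-rules",
        data=json.dumps(rule).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: token-based auth; adapt to your own setup.
            "Authorization": f"Bearer {token}",
            # Assumption: the value "true"; see the lead-in above.
            "X-Disable-Provenance": "true",
        },
    )


if __name__ == "__main__":
    req = build_create_rule_request("https://HOSTNAME.XXX.YYY.net", "TOKEN", {"title": "TESTING"})
    # urllib stores header names capitalized; HTTP header names are case-insensitive.
    print(req.get_method(), req.full_url, req.header_items())
```

Passing the returned object to `urllib.request.urlopen` would send it. The same header would need to be present on later update requests for the rule to stay free of provenance.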

LibiKorol commented 1 year ago

Hi,

All of our alerts are provisioned and not editable via the UI. We would like to be able to edit them via the UI, so we tried adding X-Disable-Provenance to a POST to /api/v1/provisioning/alert-rules but received "bad request data". We tried the same with PUT /api/v1/provisioning/folder/{FolderUID}/rule-groups/{Group} but received the same error.

Which value should X-Disable-Provenance receive? Is there another way to make all alert rules editable via the UI?

YouZhengChuan commented 6 months ago

Hi,

All of our alerts are provisioned and not editable via the UI. We would like to be able to edit them via the UI, so we tried adding X-Disable-Provenance to a POST to /api/v1/provisioning/alert-rules but received "bad request data". We tried the same with PUT /api/v1/provisioning/folder/{FolderUID}/rule-groups/{Group} but received the same error.

Which value should X-Disable-Provenance receive? Is there another way to make all alert rules editable via the UI?

I found a solution to this problem, and it's very simple:

  1. Log in to the MySQL database used by Grafana.

  2. In the provenance_type table, delete the row whose record_key value is the "uid" of the alert rule you need to edit, like this:

    delete from provenance_type where `record_key` = 'a27f7549-e368-4562-bd26-fda596d25c39';

  3. Refresh the page and you will find that the alert rule becomes editable.
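
The delete step above can also be scripted with a parameterized query. The following is a minimal sketch that uses an in-memory SQLite database as a stand-in for Grafana's MySQL database. The two-column table created in the demo is a simplified assumption rather than Grafana's real schema, and `clear_provenance` is a hypothetical helper name. MySQL drivers for Python typically use `%s` placeholders instead of `?`.

```python
import sqlite3


def clear_provenance(conn: sqlite3.Connection, rule_uid: str) -> int:
    """Delete the provenance record for one alert rule; return rows removed."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "DELETE FROM provenance_type WHERE record_key = ?", (rule_uid,)
        )
    return cur.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in table; the real table has more columns.
    conn.execute("CREATE TABLE provenance_type (record_key TEXT, provenance TEXT)")
    conn.execute(
        "INSERT INTO provenance_type VALUES (?, ?)",
        ("a27f7549-e368-4562-bd26-fda596d25c39", "api"),
    )
    print(clear_provenance(conn, "a27f7549-e368-4562-bd26-fda596d25c39"))  # prints 1
```

Checking the returned row count is a cheap way to confirm that the uid matched a record before refreshing the UI.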

dsamborschi commented 1 week ago

X-Disable-Provenance = disabled works in 11.2.0