grafana / oncall

Developer-friendly incident response with brilliant Slack integration
GNU Affero General Public License v3.0

1.9.15 cannot connect to grafana(11.1.0) - "Plugin is not connected" #4960

Open daasol opened 2 months ago

daasol commented 2 months ago

What went wrong?

Hello,

I am currently using Grafana OSS version 11.1.0 and have deployed the Grafana OnCall plugin (1.9.15).

What happened:

grafana.log

logger=context userId=2 orgId=1 uname=sa-1-sa-autogen-oncall t=2024-08-30T08:34:02.990990635Z level=info msg="Request Completed" method=GET path=/api/plugins/grafana-incident-app/settings status=404 remote_addr=10.201.36.203 time_ms=39 duration=39.011119ms size=64 referer= handler=/api/plugins/:pluginId/settings status_source=server
logger=plugin.grafana-oncall-app t=2024-08-30T08:34:02.992883818Z level=error msg="getting incident plugin settings" error="request did not return 200: 404"
logger=context userId=2 orgId=1 uname=sa-1-sa-autogen-oncall t=2024-08-30T08:34:03.054445886Z level=info msg="Request Completed" method=GET path=/api/plugins/grafana-labels-app/settings status=404 remote_addr=10.201.36.203 time_ms=38 duration=38.319328ms size=64 referer= handler=/api/plugins/:pluginId/settings status_source=server
logger=plugin.grafana-oncall-app t=2024-08-30T08:34:03.056015885Z level=error msg="getting labels plugin settings" error="request did not return 200: 404"
logger=plugin.grafana-oncall-app t=2024-08-30T08:34:03.126314798Z level=info msg=GetSyncData time=65
logger=plugin.grafana-oncall-app t=2024-08-30T08:34:03.175113158Z level=error msg="Error unmarshalling OnCallError" error="invalid character '<' looking for beginning of value"
logger=plugin.grafana-oncall-app t=2024-08-30T08:34:03.205154621Z level=info msg=GetUser user="map[Email:admin@localhost Login:admin Name:admin Role:Admin]"

engine log

2024-08-30 08:32:41 source=engine:app google_trace_id=none logger=root inbound latency=0.000621 status=200 method=GET path=/api/internal/v1/health/ user_agent=Go-http-client/1.1 content-length=0 slow=0
2024-08-30 08:32:41 source=engine:uwsgi status=200 method=GET path=/api/internal/v1/health/ latency=0.002093 google_trace_id=- protocol=HTTP/1.1 resp_size=221 req_body_size=0
2024-08-30 08:32:41 source=engine:app google_trace_id=none logger=apps.social_auth.middlewares SocialAuthAuthCanceledExceptionMiddleware.process_exception: Object of type OnCallError is not JSON serializable
2024-08-30 08:32:41 source=engine:app google_trace_id=none logger=django.request Internal Server Error: /api/internal/v1/plugin/v2/install
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/django/core/handlers/exception.py", line 55, in inner
    response = get_response(request)
               ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/django/core/handlers/base.py", line 220, in _get_response
    response = response.render()
               ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/django/template/response.py", line 114, in render
    self.content = self.rendered_content
                   ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/rest_framework/response.py", line 74, in rendered_content
    ret = renderer.render(self.data, accepted_media_type, context)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/rest_framework/renderers.py", line 100, in render
    ret = json.dumps(
          ^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/rest_framework/utils/json.py", line 25, in dumps
    return json.dumps(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
          ^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/encoder.py", line 200, in encode
    chunks = self.iterencode(o, _one_shot=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/encoder.py", line 258, in iterencode
    return _iterencode(o, 0)
           ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/rest_framework/utils/encoders.py", line 67, in default
    return super().default(obj)
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/encoder.py", line 180, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type OnCallError is not JSON serializable

The configuration (OnCall values.yaml) is as follows, and all pods are running normally.

base_url: grafana-oncall-test.xxx.xx
base_url_protocol: https

# Celery workers pods configuration
celery:
  replicaCount: 1
  worker_queue: "default,critical,long,slack,telegram,webhook,celery,grafana,retry"
  worker_concurrency: "1"
  worker_max_tasks_per_child: "100"
  worker_beat_enabled: "True"
  worker_shutdown_interval: "65m"
  livenessProbe:
    enabled: true
    initialDelaySeconds: 30
    periodSeconds: 300
    timeoutSeconds: 10

oncall:
  devMode: false
  secrets:
    existingSecret: oncall-secret
    secretKey: secretKey
    mirageSecretKey: mirageSecretKey
  slack:
    enabled: false
  telegram:
    enabled: false
  smtp:
    enabled: false
  exporter:
    enabled: false

migrate:
  enabled: true
  useHook: false

ingress:
  enabled: true
  className: traefik-default
  tls:
    - hosts:
        - "{{ .Values.base_url }}"
      secretName: xxx-xx-tls

ingress-nginx:
  enabled: false

cert-manager:
  enabled: false

database:
  type: postgresql

mariadb:
  enabled: true

postgresql:
  #enabled: false
  enabled: true
  auth:
    database: oncall
    existingSecret: postgresql-secret

rabbitmq:
  enabled: true
  auth:
    existingPasswordSecret: rabbitmq-secret

broker:
  type: rabbitmq

redis:
  enabled: true
  auth:
    existingSecret:

grafana:
  enabled: true
  grafana.ini:
    server:
      domain: grafana-oncall-test.xxx.xx
      root_url: https://grafana-oncall-test.xxxx.xx/grafana #"%(protocol)s://%(domain)s/grafana"
      serve_from_sub_path: true
      cert_file: /etc/grafana/tls.crt
      cert_key: /etc/grafana/tls.key

  persistence:
    enabled: true
  rbac:
    pspEnabled: false

  plugins:
    - grafana-oncall-app
  extraSecretMounts:
    - name: tls-cert
      mountPath: /etc/grafana/tls.crt
      secretName: xxx-xx-tls
      readOnly: true
      subPath: tls.crt
    - name: tls-key
      mountPath: /etc/grafana/tls.key
      secretName: xxx-xx-tls
      readOnly: true
      subPath: tls.key
  image:
    repository: grafana/grafana
  initChownData:
    image:
      repository: library/busybox
  testFramework:
    image:
      repository: bats/bats

  datasources:
    oncall.yaml:
      apiVersion: 1
      apps:
        - type: grafana-oncall-app
          org_id: 1
          enabled: true
          jsonData:
            stackId: 5
            orgId: 1
            onCallApiUrl: http://grafana-oncall-test-engine:8080

prometheus:
  enabled: false

Here are the steps I followed to add the plugin in Grafana:

  1. Enabled the plugin
  2. Registered the OnCall API URL
  3. Tested the connection using the 'Connect' button
  4. However, when testing the connection with the 'Connect' button, a 403 error occurs.

I have some additional questions: How does the Service Account created when enabling the plugin in Grafana interact with OnCall?

I believe the Service Account generated by Grafana needs to be mounted as a secret in OnCall. Which environment variable should be used to assign Grafana’s Service Account in OnCall?

And most importantly, I cannot find a way to view the actual token value of the Service Account automatically created by Grafana. Do you know how to resolve this issue?

How do we reproduce it?

  1. Enabled the plugin in grafana
  2. Registered the OnCall API URL
  3. Tested the connection using the 'Connect' button

A 403 error occurs.

Grafana OnCall Version

1.9.15

Product Area

Alert Flow & Configuration

Grafana OnCall Platform?

Kubernetes

User's Browser?

Chrome 128.0.6613.114

Anything else to add?

No response

krisht8485 commented 2 months ago

I am getting exactly the same error with Grafana Enterprise and with any Grafana OnCall version from 1.9.* upward. The same setup works with 1.8.13 and below.

eriklaco commented 2 months ago

I ran into a similar issue while trying to install the plugin. The error in the Grafana plugin configuration was "Plugin is not connected", and the OnCall engine logs showed the following error:

2024-09-01 15:38:41 source=engine:app google_trace_id=none logger=apps.grafana_plugin.helpers.client Error connecting to api instance HTTPConnectionPool(host='localhost', port=3000): Max retries exceeded with url: /api/org (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f30fba84b00>: Failed to establish a new connection: [Errno 111] Connection refused'))
2024-09-01 15:38:41 source=engine:app google_trace_id=none logger=root outbound latency=0.006080286999349482 status=503 method=HEAD url=http://localhost:3000/api/org slow=0

This happened even though the environment variable GRAFANA_API_URL was set.

self_hosted_install.py contains

class SelfHostedInstallView(GrafanaHeadersMixin, APIView):
    def post(self, _request: Request) -> Response:
        """
        We've already validated that settings.GRAFANA_API_URL is set (in apps.grafana_plugin.GrafanaPluginConfig)
        The user is now trying to finish plugin installation. We'll take the Grafana API url that they specified +
        the token that we are provided and first verify them. If all is good, upsert the organization in the database,
        and provision the plugin.
        """
        stack_id = settings.SELF_HOSTED_SETTINGS["STACK_ID"]
        org_id = settings.SELF_HOSTED_SETTINGS["ORG_ID"]
        grafana_url = settings.SELF_HOSTED_SETTINGS["GRAFANA_API_URL"]
        grafana_api_token = self.instance_context["grafana_token"]

        grafana_api_client = GrafanaAPIClient(api_url=grafana_url, api_token=grafana_api_token)

When I logged api_url inside GrafanaAPIClient, its value was localhost:3000.

I hot-fixed it for myself by hard-coding my Grafana API URL into the code and re-installing the plugin.

krisht8485 commented 2 months ago

Thank you, the solution works. The following update to the backend table, pointing it at the correct URL, also works:

UPDATE public.plugin_setting SET json_data = '{"grafanaUrl":"http://xxx.xx.xx.x:3000","license":"OpenSource","onCallApiUrl":"http://xxx.xx.xx.x:8080","orgId":100,"stackId":5}' WHERE id=2;

nordby commented 2 months ago

Replying to eriklaco's comment above (hard-coding the Grafana API URL in self_hosted_install.py and re-installing the plugin): I tried this option, but the logs still show localhost and the problem remains.

daasol commented 2 months ago

I am posting additional information. When accessing the Grafana OnCall plugin page, the status response obtained through developer mode is as follows:

{
    "pluginConnection": {
        "settings": {
            "ok": true
        },
        "service_account_token": {
            "ok": true
        },
        "grafana_url_from_plugin": {
            "ok": true
        },
        "grafana_url_from_engine": {
            "ok": false,
            "error": "Not validated"
        },
        "oncall_api_url": {
            "ok": false,
            "error": "Not validated"
        },
        "oncall_token": {
            "ok": false,
            "error": "Unauthorized/Forbidden while accessing OnCall engine: /api/internal/v1/plugin/v2/status, status code: 403, check token"
        }
    },
    "license": "",
    "version": "",
    "currently_undergoing_maintenance_message": "",
    "api_url": ""
}
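A status payload like this can be reduced to just the failing checks. The following is a minimal Python sketch of that idea; the helper name failing_checks is made up for illustration, and the embedded JSON is an abbreviated copy of the response above rather than anything fetched from a live Grafana.

```python
import json

# Abbreviated copy of the status payload shown above (assumption: same shape
# as returned by the plugin's status endpoint in this thread).
STATUS_JSON = """
{
    "pluginConnection": {
        "settings": {"ok": true},
        "service_account_token": {"ok": true},
        "grafana_url_from_plugin": {"ok": true},
        "grafana_url_from_engine": {"ok": false, "error": "Not validated"},
        "oncall_api_url": {"ok": false, "error": "Not validated"},
        "oncall_token": {"ok": false, "error": "Unauthorized/Forbidden while accessing OnCall engine: /api/internal/v1/plugin/v2/status, status code: 403, check token"}
    },
    "license": "",
    "version": ""
}
"""


def failing_checks(status: dict) -> dict:
    """Return {check_name: error} for every pluginConnection check that is not ok."""
    checks = status.get("pluginConnection", {})
    return {
        name: result.get("error", "unknown error")
        for name, result in checks.items()
        if not result.get("ok", False)
    }


if __name__ == "__main__":
    for name, error in failing_checks(json.loads(STATUS_JSON)).items():
        print(f"{name}: {error}")
```

In this payload only oncall_token carries a concrete error (the 403); the two "Not validated" entries appear to be checks that were never reached.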

Additionally, based on the previously mentioned logs:

2024-09-02 13:53:41 source=engine:app google_trace_id=none logger=django.request Internal Server Error: /api/internal/v1/plugin/v2/install
Traceback (most recent call last):
...
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type OnCallError is not JSON serializable
2024-09-02 13:53:41 source=engine:app google_trace_id=none logger=django.request Forbidden: /api/internal/v1/plugin/v2/status

It seems that the OnCallError object is not being serialized into JSON. Moreover, to resolve the 403 Forbidden error occurring during the request to /api/internal/v1/plugin/v2/status, I would like to check the permissions of the API key or service account. Could you please advise on how to verify if the values in the following part are correctly configured?
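On the serialization point, the TypeError in the traceback is what Python's json module raises whenever it is handed an object it has no encoder for. The sketch below reproduces that behaviour with a hypothetical stand-in class (ExampleError, with invented fields; the real OnCallError definition is not shown in this thread) and shows that converting the object to a plain dict first avoids it. It is an illustration of the failure mode, not the fix applied upstream.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ExampleError:
    """Hypothetical stand-in for OnCallError; field names are assumptions."""
    code: int
    message: str


err = ExampleError(code=1000, message="install failed")

# Passing the object itself triggers the same kind of TypeError as in the traceback.
try:
    json.dumps({"error": err})
    raised = False
except TypeError as exc:
    raised = True
    print(f"TypeError: {exc}")

# Converting to a plain dict first makes the payload serializable.
payload = json.dumps({"error": asdict(err)})
print(payload)
```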

melquisedequecosta98 commented 2 months ago

Same problem here.

OnCall Helm release "1.9.21"

SeanGaluzzi commented 2 months ago

We are encountering the same issue: unable to connect the OnCall Grafana plugin.

Error Description

When attempting to connect the OnCall plugin, I encounter the following error message:

Plugin is not connected: Unauthorized/Forbidden while accessing OnCall engine: /api/internal/v1/plugin/v2/status, status code: 403, check token.

Engine Log

2024-09-05 11:32:41 source=engine:app google_trace_id=none logger=root inbound latency=0.000621 status=200 method=GET path=/api/internal/v1/health/ user_agent=Go-http-client/1.1 content-length=0 slow=0
2024-09-05 11:32:41 source=engine:uwsgi status=200 method=GET path=/api/internal/v1/health/ latency=0.002093 google_trace_id=- protocol=HTTP/1.1 resp_size=221 req_body_size=0
2024-09-05 11:32:41 source=engine:app google_trace_id=none logger=apps.social_auth.middlewares SocialAuthAuthCanceledExceptionMiddleware.process_exception: Object of type OnCallError is not JSON serializable
2024-09-05 11:32:41 source=engine:app google_trace_id=none logger=django.request Internal Server Error: /api/internal/v1/plugin/v2/install
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/django/core/handlers/exception.py", line 55, in inner
    response = get_response(request)
               ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/django/core/handlers/base.py", line 220, in _get_response
    response = response.render()
               ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/django/template/response.py", line 114, in render
    self.content = self.rendered_content
                   ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/rest_framework/response.py", line 74, in rendered_content
    ret = renderer.render(self.data, accepted_media_type, context)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/rest_framework/renderers.py", line 100, in render
    ret = json.dumps(
          ^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/rest_framework/utils/json.py", line 25, in dumps
    return json.dumps(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
          ^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/encoder.py", line 200, in encode
    chunks = self.iterencode(o, _one_shot=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/encoder.py", line 258, in iterencode
    return _iterencode(o, 0)
           ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/rest_framework/utils/encoders.py", line 67, in default
    return super().default(obj)
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/encoder.py", line 180, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type OnCallError is not JSON serializable

I conducted a series of experiments

It seems that the token generated by Grafana's Service Account (the one created by OnCall when you click "Re-create" in the configuration tab) needs to be added to the .env file. However, there is no way to obtain this token, as it is generated automatically without an option to copy it.

I tried deleting the token from the Service Account and creating one manually so that I could copy it and place it in the .env file. This way, the "Connect" button no longer displayed any errors, but the green checkmark did not appear for the second step of OnCall.

Only after clicking the "Re-create" button did the green checkmark show up, and then the 403 error appeared again.

mderynck commented 2 months ago

Recently we made some changes to the way Grafana OnCall is initialized. Use 1.9.22, there were quite a few changes along the way from 1.9.0-1.9.22 to get things working.

achintya-7 commented 2 months ago

Recently we made some changes to the way Grafana OnCall is initialized. Use 1.9.22, there were quite a few changes along the way from 1.9.0-1.9.22 to get things working.

  • If you are running Grafana 11 and newer you must have externalServiceAccounts feature toggle enabled. This has already been enabled in the docker compose files and helm charts in the oncall repo.
  • Plugin settings must be provided to the plugin using an API call if you are installing for the first time (Note: credentials and hostnames need to be adjusted for your configuration, stackId and orgId are expected to be the listed constants in a self-hosted configuration)
curl -X POST 'http://admin:admin@localhost:3000/api/plugins/grafana-oncall-app/settings' -H "Content-Type: application/json" -d '{"enabled":true, "jsonData":{"stackId":5, "orgId":100, "onCallApiUrl":"http://engine:8080/", "grafanaUrl":"http://grafana:3000/"}}'
  • Once settings are configured use this API call to install:
curl -X POST 'http://admin:admin@localhost:3000/api/plugins/grafana-oncall-app/resources/plugin/install'

Grafana OnCall should now be ready to use. For additional troubleshooting see here
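The two curl calls above could also be scripted. The following is a minimal Python sketch, not something from the OnCall repo: the helper names (build_settings_request, build_install_request) are made up for illustration, and the credentials, hostnames, stackId and orgId are simply the example values from the commands above and need adjusting for a real setup. The sketch only builds the requests; sending them is left as a comment.

```python
import base64
import json
import urllib.request

PLUGIN_ID = "grafana-oncall-app"


def _basic_auth(user: str, password: str) -> str:
    """Build an HTTP Basic Authorization header value."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"


def build_settings_request(grafana, user, password, oncall_api_url, grafana_url):
    """POST request that stores the plugin settings (first curl call above)."""
    body = {
        "enabled": True,
        "jsonData": {
            "stackId": 5,   # constants quoted in the thread for self-hosted installs
            "orgId": 100,
            "onCallApiUrl": oncall_api_url,
            "grafanaUrl": grafana_url,
        },
    }
    return urllib.request.Request(
        f"{grafana.rstrip('/')}/api/plugins/{PLUGIN_ID}/settings",
        data=json.dumps(body).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": _basic_auth(user, password),
        },
        method="POST",
    )


def build_install_request(grafana, user, password):
    """POST request that triggers the plugin install (second curl call above)."""
    return urllib.request.Request(
        f"{grafana.rstrip('/')}/api/plugins/{PLUGIN_ID}/resources/plugin/install",
        headers={"Authorization": _basic_auth(user, password)},
        method="POST",
    )


if __name__ == "__main__":
    settings_req = build_settings_request(
        "http://localhost:3000", "admin", "admin",
        "http://engine:8080/", "http://grafana:3000/",
    )
    install_req = build_install_request("http://localhost:3000", "admin", "admin")
    print(settings_req.get_method(), settings_req.full_url)
    print(install_req.get_method(), install_req.full_url)
    # To send them for real, in this order:
    #   urllib.request.urlopen(settings_req)
    #   urllib.request.urlopen(install_req)
```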

This worked for our self-hosted Grafana OnCall running via Docker Compose. Thank you.

daasol commented 2 months ago

This worked for me (using helm). Thank you so much for your response.

However, I have one question. Is there a way to manage the settings I updated via commands as a file? I want to manage and deploy a file like grafana-oncall-app-provisioning.yaml using a Helm chart.

I tried to configure it in JSON format to make Grafana recognize the stackId and the Grafana oncall URL, but it didn’t work. I referred to the provisioning file in the oncall repo

datasources:
  grafana-oncall-app.yaml:
    apiVersion: 1
    apps:
      - type: grafana-oncall-app
        name: grafana-oncall-app vv1.9.19
        #access: proxy
        enabled: true
        jsonData:
          stackId: 5
          orgId: 1
          onCallApiUrl: https://grafana-oncall.xxxx.xx
          oncallUrl: https://grafana-oncall.xxxx.xx
          grafanaUrl: "https://grafana.xxxx.xx"

and mount like this

    volumeMounts:
      - name: config
        mountPath: "/etc/grafana/grafana.ini"
        subPath: grafana.ini
      - name: config
        mountPath: "/etc/grafana/provisioning/plugins/grafana-oncall-app.yaml"
        subPath: grafana-oncall-app.yaml

However, it did not seem to work as expected.

Would it be possible for you to update the guide document and provide detailed instructions on how to configure this?

daasol commented 2 months ago

We are also using LDAP in another Grafana instance that is self-hosted. However, despite having the same configuration across the board, a 403 error occurs, and we can't proceed any further. (gf version 11.1.0) A pop-up appears indicating that the plugin was updated successfully, and when I checked the database, everything seemed to be updated correctly.

Plugin is not connected

After several attempts, I was able to successfully connect Grafana and OnCall.

Issue where the OnCall token wasn't being generated → Check the orgId

I am using a self-hosted Grafana, and when installing the OnCall plugin, I set the orgId to 1, but it failed. (I thought this orgId was 1 because the main org in Grafana was set to 1.) So, I changed the orgId to 100. (What exactly does orgId mean?) I sent POST requests with orgId set to 1 and 100, and there was a difference in the response when I checked the logs.

[orgId 1 case]

grafana

grafana-85bdf5484b-bpvkw:/usr/share/grafana# curl -X POST 'http://temp-admin:xxx@localhost:3000/api/plugins/grafana-oncall-app/settings' \
-H "Content-Type: application/json" \
-d '{"enabled":true, "jsonData":{"stackId":5, "orgId":1, "onCallApiUrl":"http://grafana-oncall-engine.grafana-oncall:8080/", "grafanaUrl":"http://grafana.grafana:80/"}}'

>>> Response: {"message":"Plugin settings updated"}

db

[orgId 100 case]

grafana

grafana-85bdf5484b-bpvkw:/usr/share/grafana# curl -X POST 'http://temp-admin:xxx@localhost:3000/api/plugins/grafana-oncall-app/settings' \
-H "Content-Type: application/json" \
-d '{"enabled":true, "jsonData":{"stackId":5, "orgId":100, "onCallApiUrl":"http://grafana-oncall-engine.grafana-oncall:8080/", "grafanaUrl":"http://grafana.grafana.svc.cluster.local:80/"}}'

>> {"pluginConnection":{"settings":{"ok":true},"service_account_token":{"ok":true},"grafana_url_from_plugin":{"ok":true},"grafana_url_from_engine":{"ok":false,"error":"Not validated"},"oncall_api_url":{"ok":true},"oncall_token":{"ok":true}},"license":"OpenSource","version":"v1.9.21","currently_undergoing_maintenance_message":"","api_url":"https://grafana-oncall.xxxx.xx/"}

db

GRAFANA_API_URL

Additionally, I found the following logs from the engine:

Since Grafana and the engine are in different namespaces, they should communicate using the format {svc}.{namespace}:port. However, in the actual engine logs, it appeared as {svc}:port.

2024-09-08 12:55:24 source=engine:app google_trace_id=none logger=apps.grafana_plugin.helpers.client Error connecting to api instance HTTPConnectionPool(host='grafana', port=80): Max retries exceeded with url: /api/org (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f16d8fa4260>: Failed to establish a new connection: [Errno -5] Name has no usable address'))
2024-09-08 12:55:24 source=engine:app google_trace_id=none logger=root outbound latency=0.002581873908638954 status=503 method=HEAD url=http://grafana:80/api/org slow=0
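The {svc}.{namespace}:port convention mentioned above can be expressed as a small helper. This is a minimal Python sketch under stated assumptions: both function names are hypothetical, and the "contains a dot" test is only a rough heuristic for spotting a bare service name such as the one in the engine log, not a real DNS check.

```python
from urllib.parse import urlsplit


def service_url(service: str, namespace: str, port: int, scheme: str = "http") -> str:
    """Build a cross-namespace Kubernetes service URL: {svc}.{namespace}:port."""
    return f"{scheme}://{service}.{namespace}:{port}"


def is_namespace_qualified(url: str) -> bool:
    """Heuristic: a host with no dot is a bare service name, which only
    resolves from inside the service's own namespace."""
    host = urlsplit(url).hostname or ""
    return "." in host


if __name__ == "__main__":
    print(service_url("grafana", "grafana", 80))
    # URL taken from the engine log above: bare service name.
    print(is_namespace_qualified("http://grafana:80/api/org"))
    # Fully qualified form used in the orgId 100 request.
    print(is_namespace_qualified("http://grafana.grafana.svc.cluster.local:80/"))
```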

Also, when I checked the job's YAML file, the Grafana URL was set as follows:

        - name: GRAFANA_API_URL
          value: http://grafana  # (not svc.namespace:port)

This value is templated through the following links: Link 1 Link 2

{{- define "snippet.grafana.url" -}}
{{ if .Values.grafana.enabled -}}
  http://{{ include "oncall.grafana.fullname" . }}
{{- else -}}
  {{ required "externalGrafana.url is required when not grafana.enabled" .Values.externalGrafana.url }}
{{- end }}
{{- end }}

Since the intended Grafana URL was not applied, I enclosed the value in double quotes to set it properly. Example:

externalGrafana:
  # Example: https://grafana.mydomain.com
  url: "http://grafana.grafana:80"

Status response when the Grafana OnCall engine has an incorrect Grafana URL setting:

grafana-85bdf5484b-bpvkw:/usr/share/grafana# curl -X GET 'http://temp-admin:xxx@localhost:3000/api/plugins/grafana-oncall-app/resources/plugin/status'
{"pluginConnection":{"settings":{"ok":true},"service_account_token":{"ok":true},"grafana_url_from_plugin":{"ok":true},

"grafana_url_from_engine":{"ok":false,"error":"Not validated"}, 

"oncall_api_url":{"ok":true},"oncall_token":
{"ok":true}},"license":"OpenSource","version":"v1.9.21","currently_undergoing_maintenance_message":"","api_url":"https://grafana-oncall.xxxx.xx/"}

Status response when the correct Grafana URL is set for the Grafana OnCall engine and the migrate job:

grafana-85bdf5484b-bpvkw:/usr/share/grafana# curl -X GET 'http://temp-admin:xxxx@localhost:3000/api/plugins/grafana-oncall-app/resources/plugin/status'
{"pluginConnection":{"settings":{"ok":true},"service_account_token":{"ok":true},"grafana_url_from_plugin":{"ok":true},

"grafana_url_from_engine":{"ok":true}, 

"oncall_api_url":{"ok":true},"oncall_token":{"ok":true}},"license":"OpenSource","version":"v1.9.21","currently_undergoing_maintenance_message":"","api_url":"https://grafana-oncall.xxxx.xx/"}

After this, the Grafana URL was recognized correctly, and communication began successfully.

atmaniak commented 2 months ago

I think we have a related issue: after upgrading, OnCall is no longer connected.

When following the install steps from the documentation, we get this error:

2024-09-09 15:33:30 source=engine:app google_trace_id=none logger=apps.social_auth.middlewares SocialAuthAuthCanceledExceptionMiddleware.process_exception: Object of type OnCallError is not JSON serializable
2024-09-09 15:33:30 source=engine:app google_trace_id=none logger=django.request Internal Server Error: /api/internal/v1/plugin/v2/install
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/django/core/handlers/exception.py", line 55, in inner
    response = get_response(request)
               ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/django/core/handlers/base.py", line 220, in _get_response
    response = response.render()
               ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/django/template/response.py", line 114, in render
    self.content = self.rendered_content
                   ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/rest_framework/response.py", line 74, in rendered_content
    ret = renderer.render(self.data, accepted_media_type, context)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/rest_framework/renderers.py", line 100, in render
    ret = json.dumps(
          ^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/rest_framework/utils/json.py", line 25, in dumps
    return json.dumps(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
          ^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/encoder.py", line 200, in encode
    chunks = self.iterencode(o, _one_shot=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/encoder.py", line 258, in iterencode
    return _iterencode(o, 0)
           ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/rest_framework/utils/encoders.py", line 67, in default
    return super().default(obj)
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/encoder.py", line 180, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type OnCallError is not JSON serializable

maffelbaffel commented 2 months ago

Also getting this error with plugin v1.9.25

chadhardcastle commented 2 months ago

We are also using LDAP in another Grafana instance that is self-hosted. However, despite having the same configuration across the board, a 403 error occurs, and we can't proceed any further. (gf version 11.1.0) A pop-up appears indicating that the plugin was updated successfully, and when I checked the database, everything seemed to be updated correctly.

Plugin is not connected

After several attempts, I was able to successfully connect Grafana and OnCall.

Issue where the OnCall token wasn't being generated → Check the orgId

I am using a self-hosted Grafana, and when installing the OnCall plugin, I set the orgId to 1, but it failed. (I thought this orgId was 1 because the main org in Grafana was set to 1.) So, I changed the orgId to 100. (What exactly does orgId mean?) I sent POST requests with orgId set to 1 and 100, and there was a difference in the response when I checked the logs.

[orgId 1 case]

grafana

grafana-85bdf5484b-bpvkw:/usr/share/grafana# curl -X POST 'http://temp-admin:xxx@localhost:3000/api/plugins/grafana-oncall-app/settings' \
-H "Content-Type: application/json" \
-d '{"enabled":true, "jsonData":{"stackId":5, "orgId":1, "onCallApiUrl":"http://grafana-oncall-engine.grafana-oncall:8080/", "grafanaUrl":"http://grafana.grafana:80/"}}'

>>> Response: {"message":"Plugin settings updated"}

db

* the service account and Grafana token were generated, but the OnCall token was not created (only grafanaToken exists)
grafana=> SELECT * FROM public.plugin_setting;
 id | org_id |     plugin_id      | enabled | pinned |                                                    json_data                                                     |                                                                        secure_json_data                                                                         |       created       |       updated       | plugin_version
----+--------+--------------------+---------+--------+------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------------------+----------------
  3 |      1 | grafana-oncall-app | t       | t      | {"grafanaUrl":"https://grafana.xxxx.xx/","onCallApiUrl":"https://grafana-oncall.xxxx.xx/","orgId":1,"stackId":5} | {"grafanaToken":"I1lXUjROW..."} | 2024-09-07 09:57:20 | 2024-09-07 10:14:18 | v1.9.22
(1 row)

[orgId 100 case]

grafana

grafana-85bdf5484b-bpvkw:/usr/share/grafana# curl -X POST 'http://temp-admin:xxx@localhost:3000/api/plugins/grafana-oncall-app/settings' \
-H "Content-Type: application/json" \
-d '{"enabled":true, "jsonData":{"stackId":5, "orgId":100, "onCallApiUrl":"http://grafana-oncall-engine.grafana-oncall:8080/", "grafanaUrl":"http://grafana.grafana.svc.cluster.local:80/"}}'

>> {"pluginConnection":{"settings":{"ok":true},"service_account_token":{"ok":true},"grafana_url_from_plugin":{"ok":true},"grafana_url_from_engine":{"ok":false,"error":"Not validated"},"oncall_api_url":{"ok":true},"oncall_token":{"ok":true}},"license":"OpenSource","version":"v1.9.21","currently_undergoing_maintenance_message":"","api_url":"https://grafana-oncall.xxxx.xx/"}

db

* the onCallApiToken is now created
grafana-> ;
 id | org_id |     plugin_id      | enabled | pinned |                                                                        json_data                                                                         |                                                                                                                                                                                                              secure_json_data                                                                                                                                                                                                               |       created       |       updated       | plugin_version

  3 |      1 | grafana-oncall-app | t       | f      | {"grafanaUrl":"http://grafana.grafana.svc.cluster.local:80/","onCallApiUrl":"http://grafana-oncall-engine.grafana-oncall:8080/","orgId":100,"stackId":5} | {"grafanaToken":"I1pXUjRPW...","onCallApiToken":"I1pXUjR..."} | 2024-09-08 12:34:52 | 2024-09-08 12:55:11 | v1.9.22
(1 row)
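
The difference between the two rows comes down to which keys exist in secure_json_data. A minimal sketch for checking that, assuming the column value has been copied out as a JSON string; the token values are the truncated placeholders from the rows above, and `missing_tokens` is a hypothetical helper name:

```python
import json

# Keys seen in the working row above.
EXPECTED_TOKENS = ("grafanaToken", "onCallApiToken")


def missing_tokens(secure_json_data: str) -> list[str]:
    """Return the expected token keys absent from a secure_json_data value."""
    data = json.loads(secure_json_data) if secure_json_data else {}
    return [key for key in EXPECTED_TOKENS if not data.get(key)]


org_1_row = '{"grafanaToken":"I1lXUjROW..."}'
org_100_row = '{"grafanaToken":"I1pXUjRPW...","onCallApiToken":"I1pXUjR..."}'

org_1_missing = missing_tokens(org_1_row)      # ['onCallApiToken']
org_100_missing = missing_tokens(org_100_row)  # []
```

An empty column, as some people report later in the thread, would list both keys as missing.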

GRAFANA_API_URL

Additionally, I found the following logs from the engine:

Since Grafana and the engine are in different namespaces, they should communicate using the format {svc}.{namespace}:port. However, in the actual engine logs, it appeared as {svc}:port.

2024-09-08 12:55:24 source=engine:app google_trace_id=none logger=apps.grafana_plugin.helpers.client Error connecting to api instance HTTPConnectionPool(host='grafana', port=80): Max retries exceeded with url: /api/org (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f16d8fa4260>: Failed to establish a new connection: [Errno -5] Name has no usable address'))
2024-09-08 12:55:24 source=engine:app google_trace_id=none logger=root outbound latency=0.002581873908638954 status=503 method=HEAD url=http://grafana:80/api/org slow=0
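
To make the {svc}.{namespace}:port format concrete, here is a small sketch that builds both the short form and the fully qualified form used in the curl commands above. `service_url` is a hypothetical helper, and `cluster.local` is assumed to be the cluster domain, as in the URL posted earlier:

```python
def service_url(service: str, namespace: str, port: int,
                scheme: str = "http", cluster_domain: str = "") -> str:
    """Build a URL for a Service that lives in another namespace."""
    host = f"{service}.{namespace}"
    if cluster_domain:
        # Fully qualified form: <svc>.<namespace>.svc.<cluster domain>
        host = f"{host}.svc.{cluster_domain}"
    return f"{scheme}://{host}:{port}"


short_form = service_url("grafana", "grafana", 80)
long_form = service_url("grafana", "grafana", 80, cluster_domain="cluster.local")
```

The bare name `grafana` from the engine log only resolves from inside the Grafana namespace itself, which matches the "Name has no usable address" error.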

Also, when I checked the job's YAML file, the Grafana URL was set as follows:

        - name: GRAFANA_API_URL
          value: http://grafana  # (not svc.namespace:port)

This value is templated through the following links: Link 1 Link 2

{{- define "snippet.grafana.url" -}}
{{ if .Values.grafana.enabled -}}
  http://{{ include "oncall.grafana.fullname" . }}
{{- else -}}
  {{ required "externalGrafana.url is required when not grafana.enabled" .Values.externalGrafana.url }}
{{- end }}
{{- end }}
...
{{- define "snippet.grafana.url" -}}
{{ if .Values.grafana.enabled -}}
  http://{{ include "oncall.grafana.fullname" . }}
{{- else -}}
  {{ required "externalGrafana.url is required when not grafana.enabled" .Values.externalGrafana.url }}
{{- end }}
{{- end }}
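
Read as plain branching logic, the template behaves roughly like the Python sketch below. This is an illustration of the snippet as quoted, not the chart's implementation: `values` stands in for the Helm values, and `grafana_fullname` stands in for whatever `oncall.grafana.fullname` renders to (assumed to be `grafana` here, matching the job YAML).

```python
def grafana_url(values: dict, grafana_fullname: str) -> str:
    """Mirror the if/else of the quoted snippet.grafana.url template."""
    if values.get("grafana", {}).get("enabled"):
        # Bundled Grafana: service name only, no namespace or port.
        return f"http://{grafana_fullname}"
    url = values.get("externalGrafana", {}).get("url")
    if not url:
        raise ValueError("externalGrafana.url is required when not grafana.enabled")
    return url


bundled = grafana_url({"grafana": {"enabled": True}}, "grafana")
external = grafana_url(
    {"grafana": {"enabled": False},
     "externalGrafana": {"url": "http://grafana.grafana:80"}},
    "grafana",
)
```

Under this reading, externalGrafana.url only takes effect when grafana.enabled is false.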

Since the intended Grafana URL was not applied, I enclosed the value in double quotes to set it properly. Example:

externalGrafana:
  # Example: https://grafana.mydomain.com
  url: "http://grafana.grafana:80"

When the Grafana OnCall engine has an incorrect Grafana URL setting:

grafana-85bdf5484b-bpvkw:/usr/share/grafana# curl -X GET 'http://temp-admin:xxx@localhost:3000/api/plugins/grafana-oncall-app/resources/plugin/status'
{"pluginConnection":{"settings":{"ok":true},"service_account_token":{"ok":true},"grafana_url_from_plugin":{"ok":true},

"grafana_url_from_engine":{"ok":false,"error":"Not validated"}, 

"oncall_api_url":{"ok":true},"oncall_token":
{"ok":true}},"license":"OpenSource","version":"v1.9.21","currently_undergoing_maintenance_message":"","api_url":"https://grafana-oncall.xxxx.xx/"}

When the correct Grafana URL is set for the Grafana OnCall engine and the migrate job:

grafana-85bdf5484b-bpvkw:/usr/share/grafana# curl -X GET 'http://temp-admin:xxxx@localhost:3000/api/plugins/grafana-oncall-app/resources/plugin/status'
{"pluginConnection":{"settings":{"ok":true},"service_account_token":{"ok":true},"grafana_url_from_plugin":{"ok":true},

"grafana_url_from_engine":{"ok":true}, 

"oncall_api_url":{"ok":true},"oncall_token":{"ok":true}},"license":"OpenSource","version":"v1.9.21","currently_undergoing_maintenance_message":"","api_url":"https://grafana-oncall.xxxx.xx/"}

After this, the Grafana URL was recognized correctly, and communication began successfully.
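
The two status responses are easier to compare when the failing checks are pulled out programmatically. A minimal sketch, assuming the response body has the `pluginConnection` shape shown above; `failed_checks` is a hypothetical helper and the sample bodies are trimmed to the relevant part:

```python
import json


def failed_checks(status_body: str) -> dict[str, str]:
    """Map each failing pluginConnection check to its error message."""
    checks = json.loads(status_body).get("pluginConnection", {})
    return {
        name: result.get("error", "unknown")
        for name, result in checks.items()
        if not result.get("ok")
    }


# Trimmed copies of the two responses above.
before_fix = (
    '{"pluginConnection":{"settings":{"ok":true},'
    '"service_account_token":{"ok":true},'
    '"grafana_url_from_plugin":{"ok":true},'
    '"grafana_url_from_engine":{"ok":false,"error":"Not validated"},'
    '"oncall_api_url":{"ok":true},"oncall_token":{"ok":true}}}'
)
after_fix = before_fix.replace(
    '{"ok":false,"error":"Not validated"}', '{"ok":true}'
)

problems_before = failed_checks(before_fix)
problems_after = failed_checks(after_fix)
```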

This resolved my issues! Thanks @daasol!

SeanGaluzzi commented 2 months ago

Hi @chadhardcastle, @daasol where did you run the following query? grafana=> SELECT * FROM public.plugin_setting; It looks like it might have been executed through the Grafana CLI, but I'm not entirely sure. Could you also share how you connected to the database to run this query? Thanks!

dinfdsooff commented 1 month ago

The query was run against the Grafana database. This issue is related: https://github.com/grafana/oncall/issues/5096

dioniseo commented 1 month ago

Hello All, did anybody find a way to automate this step using the Grafana config files?

curl -X POST 'http://user:pwd@localhost:3000/api/plugins/grafana-oncall-app/resources/plugin/install'

I was able to automate the configuration of the plugin using the grafana_plugin reloader and a ConfigMap with the described parameters, and they were successfully loaded into the plugin configuration. Without this last request, however, the plugin does not try to link with Grafana OnCall. It would be great if there were something in the config files that could fully automate this last step without intermediate API requests.
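
Until a config-only option exists, the request can at least be scripted, for example from a one-shot job. The sketch below only wraps the same POST shown above in Python's standard library; it does not remove the API call. The function names are hypothetical, and the base URL and credentials are the placeholders from the curl command.

```python
import base64
import urllib.request

INSTALL_PATH = "/api/plugins/grafana-oncall-app/resources/plugin/install"


def build_install_request(base_url: str, user: str, password: str) -> urllib.request.Request:
    """Prepare the POST to the plugin install endpoint with basic auth."""
    credentials = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=base_url.rstrip("/") + INSTALL_PATH,
        method="POST",
        headers={"Authorization": f"Basic {credentials}"},
    )


def install_plugin(base_url: str, user: str, password: str) -> int:
    """Send the request and return the HTTP status code."""
    request = build_install_request(base_url, user, password)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


request = build_install_request("http://localhost:3000", "user", "pwd")
```

Calling `install_plugin("http://localhost:3000", "user", "pwd")` would send it; `urlopen` raises `urllib.error.HTTPError` on a non-2xx response, so a failed install surfaces as an exception.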

SeanGaluzzi commented 1 month ago

Hi @dinfdsooff,

I couldn't find any documentation on connecting to the Grafana database. Could you please provide the steps to access the database and query the data?

Thanks in advance!

Mdumala commented 1 month ago

I'm trying to set up a connection between Grafana and Grafana OnCall.

From your post I can see that, after applying the plugin settings with curl, my secure_json_data column is empty (no tokens), so it's clear why I get a 403 Forbidden. Do you have any idea why the tokens are not created?