grafana / oncall

Developer-friendly incident response with brilliant Slack integration
GNU Affero General Public License v3.0

External Grafana couldn't connect to grafana oncall. #1045

Open OlegSupport opened 1 year ago

OlegSupport commented 1 year ago

In a Grafana instance running in EKS, when the plugin tries to connect to the OnCall backend, I'm seeing this error:

logger=context t=2022-12-26T14:53:22.443545292Z level=error msg="invalid API key" error="invalid API key" traceID=

logger=context userId=0 orgId=0 uname= t=2022-12-26T14:53:22.44362346Z level=info msg="Request Completed" method=HEAD path=/api/access-control/users/permissions/search status=401 remote_addr=xx.xxx.xxx.xx time_ms=0 duration=91.524µs size=0 referer=

logger=context t=2022-12-26T14:53:22.475670414Z level=error msg="invalid API key" error="invalid API key" traceID=

logger=context userId=0 orgId=0 uname= t=2022-12-26T14:53:22.475747064Z level=info msg="Request Completed" method=GET path=/api/org/users status=401 remote_addr=xx.xxx.xxx.xx time_ms=0 duration=88.058µs size=43 referer=

Logs at the same time from oncall-engine:

2022-12-26 14:54:45 source=engine:app google_trace_id=none logger=root inbound latency=0.00061 status=200 method=GET path=/health/ content-length=0 slow=0 
2022-12-26 14:54:45 source=engine:app google_trace_id=none logger=root inbound latency=0.00051 status=200 method=GET path=/ready/ content-length=0 slow=0 
2022-12-26 14:54:45 source=engine:uwsgi status=200 method=GET path=/health/ latency=0.002718 google_trace_id=- protocol=HTTP/1.1 resp_size=180 req_body_size=0
2022-12-26 14:54:45 source=engine:uwsgi status=200 method=GET path=/ready/ latency=0.001661 google_trace_id=- protocol=HTTP/1.1 resp_size=180 req_body_size=0
2022-12-26 14:55:21 source=engine:app google_trace_id=none logger=apps.grafana_plugin.helpers.client Error connecting to api instance Expecting value: line 1 column 1 (char 0)
2022-12-26 14:55:21 source=engine:app google_trace_id=none logger=root outbound latency=130.58199034398422 status=200 method=HEAD url=https://dns-name.xxx/api/org slow=1 
2022-12-26 14:55:21 source=engine:app google_trace_id=none logger=root inbound latency=130.589876 status=200 method=GET path=/api/internal/v1/plugin/status content-length=0 slow=1 
2022-12-26 14:55:21 source=engine:uwsgi status=200 method=GET path=/api/internal/v1/plugin/status latency=130.591187 google_trace_id=- protocol=HTTP/1.1 resp_size=367 req_body_size=0
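
The "Expecting value: line 1 column 1 (char 0)" text in the engine log is the message Python's json module produces when it is asked to parse a body that is empty or not JSON. A minimal sketch of how that message arises; the function name and the sample bodies are made up for illustration, and the thread does not show what Grafana actually returned:

```python
import json


def describe_body(body: str) -> str:
    """Return 'ok' if body parses as JSON, else the decoder's error message."""
    try:
        json.loads(body)
        return "ok"
    except json.JSONDecodeError as err:
        return str(err)


# Hypothetical response bodies, not taken from the logs above.
for sample in ('{"name": "Main Org."}', "", "<html>login page</html>"):
    print(repr(sample), "->", describe_body(sample))
```

The empty string and the HTML sample give the same "Expecting value" message, so this log line alone does not tell an empty reply apart from, say, a proxy or login page.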

Logs from oncall-celery:

2022-12-26 14:51:11,323 source=engine:celery worker=ForkPoolWorker-2 task_id=f499dec5-bda0-4877-bf8b-5c67d5b4ad8c task_name=apps.grafana_plugin.tasks.sync.plugin_sync_organization_async name=apps.grafana_plugin.tasks.sync level=INFO Start sync Organization 1
2022-12-26 14:53:22,447 source=engine:celery worker=ForkPoolWorker-2 task_id=f499dec5-bda0-4877-bf8b-5c67d5b4ad8c task_name=apps.grafana_plugin.tasks.sync.plugin_sync_organization_async name=apps.grafana_plugin.helpers.client level=WARNING Error connecting to api instance 401 Client Error: Unauthorized for url: https://dns-name.xxx/api/access-control/users/permissions/search?actionPrefix=grafana-oncall-app
2022-12-26 14:53:22,447 source=engine:celery worker=ForkPoolWorker-2 task_id=f499dec5-bda0-4877-bf8b-5c67d5b4ad8c task_name=apps.grafana_plugin.tasks.sync.plugin_sync_organization_async name=root level=INFO outbound latency=131.1155290780589 status=401 method=HEAD url=https://dns-name.xxx/api/access-control/users/permissions/search?actionPrefix=grafana-oncall-app slow=1 
2022-12-26 14:53:22,478 source=engine:celery worker=ForkPoolWorker-2 task_id=f499dec5-bda0-4877-bf8b-5c67d5b4ad8c task_name=apps.grafana_plugin.tasks.sync.plugin_sync_organization_async name=apps.grafana_plugin.helpers.client level=WARNING Error connecting to api instance 401 Client Error: Unauthorized for url: https://dns-name.xxx/api/org/users
2022-12-26 14:53:22,478 source=engine:celery worker=ForkPoolWorker-2 task_id=f499dec5-bda0-4877-bf8b-5c67d5b4ad8c task_name=apps.grafana_plugin.tasks.sync.plugin_sync_organization_async name=root level=INFO outbound latency=0.030806066002696753 status=401 method=GET url=https://dns-name.xxx/api/org/users slow=0 
2022-12-26 14:53:22,482 source=engine:celery worker=ForkPoolWorker-2 task_id=f499dec5-bda0-4877-bf8b-5c67d5b4ad8c task_name=apps.grafana_plugin.tasks.sync.plugin_sync_organization_async name=apps.grafana_plugin.tasks.sync level=INFO Finish sync Organization 1
2022-12-26 14:53:22,482 source=engine:celery worker=ForkPoolWorker-2 task_id=f499dec5-bda0-4877-bf8b-5c67d5b4ad8c task_name=apps.grafana_plugin.tasks.sync.plugin_sync_organization_async name=celery.app.trace level=INFO Task apps.grafana_plugin.tasks.sync.plugin_sync_organization_async[f499dec5-bda0-4877-bf8b-5c67d5b4ad8c] succeeded in 131.15941406204365s: None
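
The 401 "invalid API key" responses mean Grafana rejected the token that OnCall sent. One way to test a token by hand is to repeat the HEAD request to api/org that appears in the engine log. A minimal sketch using only the standard library; the function names are hypothetical, and it assumes the token is sent as a Bearer token in the Authorization header, which is how Grafana's HTTP API accepts API keys:

```python
import urllib.error
import urllib.request


def build_check_request(grafana_url: str, token: str) -> urllib.request.Request:
    """Build a HEAD request to <grafana_url>/api/org carrying the token."""
    url = grafana_url.rstrip("/") + "/api/org"
    return urllib.request.Request(
        url,
        method="HEAD",
        headers={"Authorization": f"Bearer {token}"},
    )


def check_token(grafana_url: str, token: str, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status (200 ok, 401 rejected)."""
    request = build_check_request(grafana_url, token)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 401 for an invalid API key


# Example (placeholder values):
# print(check_token("https://dns-name.xxx", "<your-api-key>"))
```

If this returns 401 with a key that was just created, the key may belong to a different Grafana instance or organization than the one OnCall is calling; that is a guess, not something the logs confirm.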

OlegSupport commented 1 year ago

Could someone help me? OnCall doesn't work with external Grafana 9.3.1.

tmpm697 commented 1 year ago

Same issue: https://github.com/grafana/oncall/issues/1035

OlegSupport commented 1 year ago

Not the same. I'm using Grafana OnCall in Kubernetes (EKS 1.24), not in docker-compose.

gilbertobr commented 1 year ago

I am also unable to communicate with external OnCall. I'm using GKE, and I get: "You are not authorized to communicate with the specified Grafana API - http://oncall-1672668006-grafana"

Error log:

2023-01-02 21:28:58 source=engine:uwsgi status=400 method=POST path=/api/internal/v1/plugin/self-hosted/install latency=0.028631 google_trace_id=- protocol=HTTP/1.1 resp_size=361 req_body_size=0
2023-01-02 21:29:42 source=engine:app google_trace_id=none logger=apps.grafana_plugin.helpers.client Error connecting to api instance 401 Client Error: Unauthorized for url: http://oncall-1672668006-grafana/api/org
2023-01-02 21:29:42 source=engine:app google_trace_id=none logger=root outbound latency=0.037018798997451086 status=401 method=HEAD url=http://oncall-1672668006-grafana/api/org slow=0 
2023-01-02 21:29:42 source=engine:app google_trace_id=none logger=root inbound latency=0.054978 status=400 method=POST path=/api/internal/v1/plugin/self-hosted/install content-length=0 slow=0 
2023-01-02 21:29:42 source=engine:app google_trace_id=none logger=django.request Bad Request: /api/internal/v1/plugin/self-hosted/install
2023-01-02 21:29:42 source=engine:uwsgi status=400 method=POST path=/api/internal/v1/plugin/self-hosted/install latency=0.056754 google_trace_id=- protocol=HTTP/1.1 resp_size=361 req_body_size=0
2023-01-02 21:29:45 source=engine:app google_trace_id=none logger=root inbound latency=0.000661 status=200 method=GET path=/ready/ content-length=0 slow=0 
2023-01-02 21:29:45 source=engine:uwsgi status=200 method=GET path=/ready/ latency=0.001720 google_trace_id=- protocol=HTTP/1.1 resp_size=180 req_body_size=0
2023-01-02 21:29:45 source=engine:app google_trace_id=none logger=root inbound latency=0.000804 status=200 method=GET path=/health/ content-length=0 slow=0 
2023-01-02 21:29:45 source=engine:uwsgi status=200 method=GET path=/health/ latency=0.002308 google_trace_id=- protocol=HTTP/1.1 resp_size=180 req_body_size=0
2023-01-02 21:30:07 source=engine:app google_trace_id=none logger=apps.grafana_plugin.helpers.client Error connecting to api instance 401 Client Error: Unauthorized for url: http://oncall-1672668006-grafana/api/org

gilbertobr commented 1 year ago

I noticed that there is a variable called "externalGrafana". I set it like this:

externalGrafana:
   url: grafana.mydomain.com.br

and I disabled grafana:

grafana:
  enabled: false

The message then changed to:

An unknown error occured when trying to install the plugin. Are you sure that your OnCall API URL, https://oncall.mydomain.com.br, is correct?
Refresh your page and try again, or try removing your plugin configuration and reconfiguring.

Below is the error log from the oncall-engine pod:

2023-01-02 22:19:16 source=engine:uwsgi status=404 method=GET path=/grafana/api/live/ws latency=0.002917 google_trace_id=- protocol=HTTP/1.1 resp_size=394 req_body_size=0
2023-01-02 22:19:25 source=engine:app google_trace_id=none logger=root outbound latency=0.0004296669940231368 status=503 method=HEAD url=api/org slow=0 
2023-01-02 22:19:25 source=engine:app google_trace_id=none logger=django.request Internal Server Error: /api/internal/v1/plugin/self-hosted/install
....
....
  File "/usr/local/lib/python3.9/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 573, in request
    prep = self.prepare_request(req)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 484, in prepare_request
    p.prepare(
  File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 368, in prepare
    self.prepare_url(url, params)
  File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 439, in prepare_url
    raise MissingSchema(
requests.exceptions.MissingSchema: Invalid URL 'api/org': No scheme supplied. Perhaps you meant http://api/org?
2023-01-02 22:19:25 source=engine:app google_trace_id=none logger=root inbound latency=0.020012 status=500 method=POST path=/api/internal/v1/plugin/self-hosted/install content-length=0 slow=0 
2023-01-02 22:19:25 source=engine:uwsgi status=500 method=POST path=/api/internal/v1/plugin/self-hosted/install latency=0.021294 google_trace_id=- protocol=HTTP/1.1 resp_size=372 req_body_size=0
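
The url=api/org in the log and the MissingSchema error fit how Python's urllib.parse.urljoin treats a base without a scheme: it reads the whole base as a relative path and drops it. A minimal sketch, assuming the OnCall client builds its request URL with urljoin (the traceback does not show that line, so this is an assumption):

```python
from urllib.parse import urljoin

# Base URL as first configured in externalGrafana, without a scheme.
without_scheme = urljoin("grafana.mydomain.com.br", "api/org")
print(without_scheme)
# -> api/org

# Same host with a scheme: the host is kept and the path is appended.
with_scheme = urljoin("https://grafana.mydomain.com.br", "api/org")
print(with_scheme)
# -> https://grafana.mydomain.com.br/api/org
```

Under that assumption, the host never reaches the requests library, which would explain why the error names only 'api/org'.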

gilbertobr commented 1 year ago

Another test I did was to add https to the externalGrafana url.

The error changed again:

logger=plugindashboards t=2023-01-02T22:42:18.810158235Z level=info msg="Plugin state changed" pluginId=grafana-oncall-app enabled=true
logger=plugindashboards t=2023-01-02T22:42:18.810234443Z level=info msg="Syncing plugin dashboards to DB" pluginId=grafana-oncall-app
logger=context userId=139 orgId=1 uname=gilberto@mydomain.com.br t=2023-01-02T22:42:32.811461327Z level=error msg="Request Completed" method=POST path=/api/plugin-proxy/grafana-oncall-app/api/internal/v1/plugin/self-hosted/install status=504 remote_addr=10.100.10.41 time_ms=60228 duration=1m0.228374429s size=562 referer=https://grafana.mydomain.com.br/plugins/grafana-oncall-app handler=/api/plugin-proxy/:pluginId/*
logger=ngalert.sender.router rule_uid=mAK1_S4Vz org_id=1 t=2023-01-02T22:42:47.327611364Z level=info msg="Sending alerts to local notifier" count=1
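
A 504 after roughly 60 seconds suggests that a request along the chain timed out rather than being refused. One possible cause, not confirmed in this thread, is that the oncall-engine pod cannot reach the Grafana URL at all. A minimal sketch of a reachability check that could be run from inside the engine pod; the function name and the host and port in the example are placeholders:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, DNS failures and timeouts.
        return False


# Example (placeholder host and port):
# print(can_connect("grafana.mydomain.com.br", 443))
```

A False result here points at networking (DNS, security groups, ingress) rather than at the API key.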

arousseau-coveo commented 1 year ago

I am also experiencing something similar:


2023-01-06 19:21:44 source=engine:app google_trace_id=none logger=root outbound latency=0.002253204999760783 status=503 method=HEAD url=api/org slow=0 
2023-01-06 19:21:44 source=engine:app google_trace_id=none logger=django.request Internal Server Error: /api/internal/v1/plugin/self-hosted/install
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/django/core/handlers/exception.py", line 47, in inner
    response = get_response(request)
  File "/usr/local/lib/python3.9/site-packages/django/core/handlers/base.py", line 181, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/views/decorators/csrf.py", line 54, in wrapped_view
    return view_func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/views/generic/base.py", line 70, in view
    return self.dispatch(request, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/rest_framework/views.py", line 509, in dispatch
    response = self.handle_exception(exc)
  File "/usr/local/lib/python3.9/site-packages/rest_framework/views.py", line 469, in handle_exception
    self.raise_uncaught_exception(exc)
  File "/usr/local/lib/python3.9/site-packages/rest_framework/views.py", line 480, in raise_uncaught_exception
    raise exc
  File "/usr/local/lib/python3.9/site-packages/rest_framework/views.py", line 506, in dispatch
    response = handler(request, *args, **kwargs)
  File "/etc/app/apps/grafana_plugin/views/self_hosted_install.py", line 33, in post
    _, client_status = grafana_api_client.check_token()
  File "/etc/app/apps/grafana_plugin/helpers/client.py", line 79, in check_token
    return self.api_head("api/org")
  File "/etc/app/apps/grafana_plugin/helpers/client.py", line 27, in api_head
    return self.call_api(endpoint, requests.head, body)
  File "/etc/app/apps/grafana_plugin/helpers/client.py", line 38, in call_api
    response = http_method(call_status["url"], json=body, headers=self.request_headers)
  File "/usr/local/lib/python3.9/site-packages/requests/api.py", line 100, in head
    return request("head", url, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 573, in request
    prep = self.prepare_request(req)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 484, in prepare_request
    p.prepare(
  File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 368, in prepare
    self.prepare_url(url, params)
  File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 439, in prepare_url
    raise MissingSchema(
requests.exceptions.MissingSchema: Invalid URL 'api/org': No scheme supplied. Perhaps you meant http://api/org?

pikilisaikiran commented 1 year ago

Just restarting the external Grafana pods solved my issue.

masikrus commented 1 year ago

I have a different error:

2023-02-14 20:47:43 source=engine:app google_trace_id=none logger=root inbound latency=0.014148 status=200 method=GET path=/api/internal/v1/cloud_connection content-length=0 slow=0 user_id=2 org_id=1
2023-02-14 20:47:43 source=engine:uwsgi status=200 method=GET path=/api/internal/v1/cloud_connection latency=0.015668 google_trace_id=- protocol=HTTP/1.1 resp_size=365 req_body_size=0
2023-02-14 20:47:44 source=engine:app google_trace_id=none logger=root inbound latency=0.010535 status=200 method=GET path=/api/internal/v1/live_settings content-length=0 slow=0 user_id=2 org_id=1
2023-02-14 20:47:44 source=engine:uwsgi status=200 method=GET path=/api/internal/v1/live_settings?search= latency=0.011650 google_trace_id=- protocol=HTTP/1.1 resp_size=6965 req_body_size=0
2023-02-14 20:47:44 source=engine:app google_trace_id=none logger=apps.oss_installation.models.cloud_connector Unable to sync with cloud. GRAFANA_CLOUD_ONCALL_TOKEN is invalid
2023-02-14 20:47:44 source=engine:app google_trace_id=none logger=root inbound latency=0.186894 status=200 method=PUT path=/api/internal/v1/live_settings/LAGFHWZMMKYZM content-length=112 slow=0 user_id=2 org_id=1
2023-02-14 20:47:44 source=engine:uwsgi status=200 method=PUT path=/api/internal/v1/live_settings/LAGFHWZMMKYZM?sync_users=false latency=0.188640 google_trace_id=- protocol=HTTP/1.1 resp_size=424 req_body_size=112
2023-02-14 20:47:44 source=engine:app google_trace_id=none logger=root inbound latency=0.011728 status=200 method=GET path=/api/internal/v1/current_team content-length=0 slow=0 user_id=2 org_id=1
2023-02-14 20:47:44 source=engine:uwsgi status=200 method=GET path=/api/internal/v1/current_team latency=0.012912 google_trace_id=- protocol=HTTP/1.1 resp_size=770 req_body_size=0
2023-02-14 20:47:44 source=engine:app google_trace_id=none logger=root inbound latency=0.089292 status=200 method=GET path=/api/internal/v1/cloud_connection content-length=0 slow=0 user_id=2 org_id=1
2023-02-14 20:47:44 source=engine:uwsgi status=200 method=GET path=/api/internal/v1/cloud_connection latency=0.090626 google_trace_id=- protocol=HTTP/1.1 resp_size=365 req_body_size=0

The log says "GRAFANA_CLOUD_ONCALL_TOKEN is invalid". I have a local Grafana with a local OnCall, and Telegram works fine. The OnCall URL is open to the world over https, and I created the API key at https://blablabla.grafana.net/a/grafana-oncall-app/settings. What am I doing wrong?

masikrus commented 1 year ago

OMFG, Grafana, please update the documentation. If you get this error, just add the env GRAFANA_CLOUD_ONCALL_API_URL=$From_Cloud_OnCall_Url

justlstn commented 1 year ago

Thanks for the solution. I was unable to connect OSS OnCall with Cloud OnCall until I set the GRAFANA_CLOUD_ONCALL_API_URL env.
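
A minimal sketch of a pre-flight check for the two variables named in this thread. The function name is hypothetical, and treating both as environment variables is an assumption based on the comments above; the token may instead be set through the live settings page, as the PUT to live_settings in the earlier log suggests:

```python
import os

# Variable names as they appear in this thread.
CLOUD_VARS = ("GRAFANA_CLOUD_ONCALL_API_URL", "GRAFANA_CLOUD_ONCALL_TOKEN")


def missing_cloud_vars(env=None) -> list:
    """Return the names of the Cloud OnCall variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in CLOUD_VARS if not env.get(name)]


missing = missing_cloud_vars()
if missing:
    print("Not set:", ", ".join(missing))
else:
    print("Both Cloud OnCall variables are set.")
```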

Siebjee commented 1 year ago

I had the same issue where OnCall was telling me "No scheme supplied". Adding the scheme http:// to the externalGrafana url was the fix for me.

    externalGrafana:
      url: http://grafana.grafana.svc.cluster.local:80
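
A minimal sketch of a check that would catch a scheme-less url before it reaches the engine. The function name is hypothetical, and the rule (http or https plus a host) is an assumption about what a usable externalGrafana url looks like:

```python
from urllib.parse import urlsplit


def has_http_scheme(url: str) -> bool:
    """Return True if url starts with http:// or https:// and names a host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


for candidate in (
    "grafana.mydomain.com.br",
    "http://grafana.grafana.svc.cluster.local:80",
):
    print(candidate, "->", has_http_scheme(candidate))
```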

masikrus commented 1 year ago

You need the external URL to be visible to the entire internet.

Use an nginx proxy, an alternative, or port forwarding.

And https is required.

Siebjee commented 1 year ago

Not entirely true. My OnCall and Grafana are running inside kubernetes. They can access each other without making use of the public internet (which would be bad practice in kubernetes).

masikrus commented 1 year ago

For the Grafana Cloud connection you need public access.

akashcldcvr commented 1 year ago

In the OnCall helm values.yaml:

Not working:

externalGrafana:
  url: https://grafana.monitoring.com

Update it with the internal k8s FQDN and it works:

externalGrafana:
  url: http://grafana.monitoring.svc.cluster.local:3000

Once updated with the internal k8s FQDN, the Grafana plugin is able to connect to the OnCall engine backend.

masikrus commented 1 year ago

Check your cluster domain with kubectl exec -it $POD_NAME -- cat /etc/resolv.conf

By default the Grafana svc is called kube-prometheus-grafana,

so your url is http://kube-prometheus-grafana.monitoring.svc.cluster.local:80

Just look at the output of kubectl -n monitoring get svc
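
The in-cluster address follows the usual Kubernetes pattern <service>.<namespace>.svc.<cluster-domain>. A minimal sketch that reads the cluster domain from resolv.conf text and builds the url; the function names and the sample resolv.conf content are made up for illustration, and the service name and port should be taken from your own kubectl get svc output:

```python
def cluster_domain(resolv_conf: str, default: str = "cluster.local") -> str:
    """Pick the cluster domain out of the 'search' line of resolv.conf."""
    for line in resolv_conf.splitlines():
        fields = line.split()
        if fields and fields[0] == "search":
            for entry in fields[1:]:
                # The entry of the form "svc.<domain>" carries the cluster domain.
                if entry.startswith("svc."):
                    return entry[len("svc."):]
    return default


def service_url(service: str, namespace: str, port: int, domain: str) -> str:
    """Build http://<service>.<namespace>.svc.<domain>:<port>."""
    return f"http://{service}.{namespace}.svc.{domain}:{port}"


# Illustrative resolv.conf content, not copied from a real pod.
sample = (
    "search monitoring.svc.cluster.local svc.cluster.local cluster.local\n"
    "nameserver 10.96.0.10\n"
)
print(service_url("kube-prometheus-grafana", "monitoring", 80, cluster_domain(sample)))
# -> http://kube-prometheus-grafana.monitoring.svc.cluster.local:80
```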