linkedin / iris

Iris is a highly configurable and flexible service for paging and messaging.
http://iris.claims

Applications are not properly loaded when using uwsgi #723

Open roock opened 2 years ago

roock commented 2 years ago

When running iris in combination with uwsgi (e.g. with the container image), the applications are not properly loaded at startup, which leads to 401 errors when using the webhooks API. Once the applications are fully loaded (e.g. by opening the iris UI in a web browser), the authentication error goes away without any change to the API call.

Steps to reproduce:

1. Start iris with uwsgi (e.g. the container image).
2. POST an Alertmanager payload to /v0/webhooks/alertmanager?application=oncall&key=magic: the request is rejected with a 401.
3. Open the iris UI in a web browser.
4. Repeat the same POST: it now succeeds with a 201.

Some additional logging information: I've modified https://github.com/linkedin/iris/blob/master/src/iris/cache.py#L52 to print when the applications are loaded.
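
The added line is essentially just a logger call in the function that populates the cache. A rough sketch of the change is below; the function name and the shape of the rows are paraphrased and may not match cache.py exactly:

import logging

logger = logging.getLogger(__name__)

# module-level cache, as in iris.cache
applications = {}

def update_applications(rows):
    # rows: iterable of (name, key) tuples read from the application table
    app_cache = {name: {'key': key} for name, key in rows}
    global applications
    applications = app_cache
    # the added debug line: shows exactly when each worker populates its cache
    logger.info('Loaded applications: %s', ', '.join(applications))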

uwsgi error log:

[2022-06-13 13:13:38 +0000] [10] [INFO] iris.cache Loaded applications: Autoalerts, test-app, iris, oncall
[2022-06-13 13:13:38 +0000] [20] [INFO] iris.cache Loaded applications: Autoalerts, test-app, iris, oncall

uwsgi access log:

13/Jun/2022:13:11:54 +0000 [401] POST /v0/webhooks/alertmanager?application=oncall&key=magic 172.25.0.1 [curl/7.83.1] RT:0 REF:- SZ:248 HTTP/1.1
13/Jun/2022:13:11:57 +0000 [401] POST /v0/webhooks/alertmanager?application=oncall&key=magic 172.25.0.1 [curl/7.83.1] RT:0 REF:- SZ:248 HTTP/1.1
13/Jun/2022:13:13:38 +0000 [302] GET / 172.25.0.1 [Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:101.0) Gecko/20100101 Firefox/101.0] RT:36 REF:- SZ:583 HTTP/1.1
13/Jun/2022:13:13:38 +0000 [200] GET /v0/applications 127.0.0.1 [python-requests/2.28.0] RT:42 REF:- SZ:3668 HTTP/1.1
13/Jun/2022:13:13:38 +0000 [200] GET /incidents 172.25.0.1 [Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:101.0) Gecko/20100101 Firefox/101.0] RT:1729 REF:- SZ:15859 HTTP/1.1
13/Jun/2022:13:14:00 +0000 [201] POST /v0/webhooks/alertmanager?application=oncall&key=magic 172.25.0.1 [curl/7.83.1] RT:24 REF:- SZ:195 HTTP/1.1

This looks like the application cache is not populated when the uwsgi workers start: the "Loaded applications" lines only show up at 13:13:38, right when the UI is opened for the first time, and the next webhook POST (13:14:00) succeeds.
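
If the cache refresh is only triggered lazily under uwsgi, one possible workaround might be to warm it explicitly when each worker starts, e.g. via a postfork hook. This is an untested sketch; refresh_applications is a placeholder for whatever function in iris.cache actually loads the applications, not the real API:

# untested workaround sketch: populate the application cache when each
# uwsgi worker forks, instead of waiting for the first request
from uwsgidecorators import postfork

from iris import cache


@postfork
def warm_application_cache():
    # assumes the DB connection/config is already set up at fork time;
    # cache.refresh_applications is a placeholder name
    cache.refresh_applications()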

I wasn't able to reproduce the issue in a local setup with gunicorn.

Sample data used:

{
  "version": "4",
  "groupKey": "{}:{alertname=\"high_memory_load\"}",
  "status": "firing",
  "receiver": "teams_proxy",
  "groupLabels": {
      "alertname": "high_memory_load",
      "iris_plan": "Oncall test"
  },
  "commonLabels": {
      "alertname": "high_memory_load",
      "monitor": "master",
      "severity": "warning"
  },
  "commonAnnotations": {
      "summary": "Server High Memory usage"
  },
  "externalURL": "http://docker.for.mac.host.internal:9093",
  "alerts": [
      {
          "labels": {
              "alertname": "high_memory_load",
              "instance": "10.80.40.11:9100",
              "job": "docker_nodes",
              "monitor": "master",
              "severity": "warning"
          },
          "annotations": {
              "description": "10.80.40.11 reported high memory usage with 23.28%.",
              "summary": "Server High Memory usage"
          },
          "startsAt": "2018-03-07T06:33:21.873077559-05:00",
          "endsAt": "0001-01-01T00:00:00Z"
      }
  ]
}
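
For reference, the webhook call from the access log can be replayed against this payload with something like the following (the host and port are from my local setup and will differ):

import json

import requests

# the sample payload above, saved to a file
with open('alertmanager_sample.json') as f:
    payload = json.load(f)

resp = requests.post(
    'http://localhost:16649/v0/webhooks/alertmanager',  # adjust host/port
    params={'application': 'oncall', 'key': 'magic'},
    json=payload,
)
print(resp.status_code)  # 401 right after startup, 201 once the cache is populated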