grafana / oncall

Developer-friendly incident response with brilliant Slack integration
GNU Affero General Public License v3.0

django.db.utils.OperationalError: server closed the connection unexpectedly #1409

Open muhammetbozkurt opened 1 year ago

muhammetbozkurt commented 1 year ago

Hi,

I tried to install OnCall via the Helm chart (version 1.1.29) with the following values.yaml:

base_url: my_base_url
grafana:
  enabled: false
externalGrafana:
  url: my_grafana_url
ingress-nginx:
  enabled: false
cert-manager:
  enabled: false
database:
  type: postgresql
mariadb:
  enabled: false
postgresql:
  enabled: true

However, the deployment gets stuck while running "python manage.py migrate --check". The oncall-engine and oncall-celery pods cannot connect to the postgresql pod, but the oncall-migrate job can. I could not find out why they behave differently.
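One way to narrow down a difference like this is to run the same low-level probe from inside each pod. The error "server closed the connection unexpectedly" typically means the TCP connection was accepted and then dropped before PostgreSQL answered, so a plain port check can report "open" even where the client still fails. The sketch below is a minimal, standard-library-only illustration, not part of OnCall or its Helm chart: `probe_postgres` and its return labels are hypothetical names, and it only sends the protocol's `SSLRequest` packet to see whether anything PostgreSQL-like replies.

```python
import socket
import struct

# PostgreSQL SSLRequest message: Int32 length (8) followed by Int32 code 80877103.
SSL_REQUEST = struct.pack("!II", 8, 80877103)


def probe_postgres(host: str, port: int = 5432, timeout: float = 5.0) -> str:
    """Classify how an endpoint answers a bare PostgreSQL SSLRequest.

    Hypothetical helper for illustration. Returns one of:
      "unreachable" - TCP connect failed (refused, timed out, DNS error)
      "no_reply"    - TCP connect worked, but the peer closed or reset the
                      connection, or sent nothing within the timeout
      "postgres"    - peer answered 'S' or 'N', as a PostgreSQL server would
      "unexpected"  - peer answered with some other byte
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "unreachable"

    with sock:
        try:
            sock.sendall(SSL_REQUEST)
            reply = sock.recv(1)
        except OSError:
            return "no_reply"

    if reply == b"":
        return "no_reply"
    if reply in (b"S", b"N"):
        return "postgres"
    return "unexpected"


# Example call, using the service name that appears in the migrate job log:
# print(probe_postgres("oncall-postgresql.kube-monitoring.svc.cluster.local"))
```

If the migrate job's pod reports "postgres" while the engine or celery pod reports "no_reply", the difference is likely somewhere between the pods and the database (for example a sidecar proxy or network policy) rather than in the Django settings. That is an assumption to verify, not a diagnosis.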

oncall-engine and oncall-celery wait-for-db init container logs:

Waiting for database migrations
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py", line 219, in ensure_connection
    self.connect()
  File "/usr/local/lib/python3.9/site-packages/django/utils/asyncio.py", line 33, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py", line 200, in connect
    self.connection = self.get_new_connection(conn_params)
  File "/usr/local/lib/python3.9/site-packages/django/utils/asyncio.py", line 33, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/postgresql/base.py", line 187, in get_new_connection
    connection = Database.connect(**conn_params)
  File "/usr/local/lib/python3.9/site-packages/psycopg2/__init__.py", line 122, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.

Waiting for database migrations

migrate job log:

oncall-postgresql.kube-monitoring.svc.cluster.local [172.20.198.142] 5432 (postgresql) open
Operations to perform:
  Apply all migrations: admin, alerts, auth, auth_token, base, contenttypes, email, fcm_django, heartbeat, mobile_app, oss_installation, schedules, sessions, slack, social_django, telegram, twilioapp, user_management
Running migrations:
  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying admin.0002_logentry_remove_auto_add... OK
  Applying admin.0003_logentry_add_action_flag_choices... OK
  Applying alerts.0001_squashed_initial... OK
  Applying slack.0001_squashed_initial... OK
  Applying user_management.0001_squashed_initial... OK
  Applying telegram.0001_squashed_initial... OK
  Applying slack.0002_squashed_initial... OK
  Applying schedules.0001_squashed_initial... OK
  Applying alerts.0002_squashed_initial... OK
  Applying alerts.0003_grafanaalertingcontactpoint_datasource_uid... OK
  Applying alerts.0004_auto_20220711_1106... OK
  Applying alerts.0005_alertgroup_cached_render_for_web... OK
  Applying alerts.0006_alertgroup_alerts_aler_channel_ee84a7_idx... OK
  Applying alerts.0007_populate_web_title_cache... OK
  Applying alerts.0008_alter_alertgrouplogrecord_type... OK
  Applying alerts.0009_alertreceivechannel_web_templates_modified_at... OK
  Applying contenttypes.0002_remove_content_type_name... OK
  Applying auth.0002_alter_permission_name_max_length... OK
  Applying auth.0003_alter_user_email_max_length... OK
  Applying auth.0004_alter_user_username_opts... OK
  Applying auth.0005_alter_user_last_login_null... OK
  Applying auth.0006_require_contenttypes_0002... OK
  Applying auth.0007_alter_validators_add_error_messages... OK
  Applying auth.0008_alter_user_username_max_length... OK
  Applying auth.0009_alter_user_last_name_max_length... OK
  Applying auth.0010_alter_group_name_max_length... OK
  Applying auth.0011_update_proxy_permissions... OK
  Applying auth.0012_alter_user_first_name_max_length... OK
  Applying auth_token.0001_squashed_initial... OK
  Applying auth_token.0002_squashed_initial... OK
  Applying auth_token.0003_auto_20221121_1610... OK
  Applying base.0001_squashed_initial... OK
  Applying base.0002_squashed_initial... OK
  Applying base.0003_delete_organizationlogrecord... OK
  Applying user_management.0002_auto_20220705_1214... OK
  Applying user_management.0003_user_hide_phone_number... OK
  Applying email.0001_initial... OK
  Applying fcm_django.0001_initial... OK
  Applying fcm_django.0002_auto_20160808_1645... OK
  Applying fcm_django.0003_auto_20170313_1314... OK
  Applying fcm_django.0004_auto_20181128_1642... OK
  Applying fcm_django.0005_auto_20170808_1145... OK
  Applying fcm_django.0006_auto_20210802_1140... OK
  Applying fcm_django.0007_auto_20211001_1440... OK
  Applying fcm_django.0008_auto_20211224_1205... OK
  Applying fcm_django.0009_alter_fcmdevice_user... OK
  Applying heartbeat.0001_squashed_initial... OK
  Applying user_management.0004_auto_20221025_0316... OK
  Applying mobile_app.0001_initial... OK
  Applying oss_installation.0001_squashed_initial... OK
  Applying schedules.0002_squashed_initial... OK
  Applying schedules.0003_alter_customoncallshift_frequency... OK
  Applying schedules.0004_customoncallshift_until... OK
  Applying schedules.0005_auto_20220704_1947... OK
  Applying schedules.0006_customoncallshift_rotation_start... OK
  Applying schedules.0007_customoncallshift_updated_shift... OK
  Applying schedules.0008_auto_20221201_0809... OK
  Applying sessions.0001_initial... OK
  Applying social_django.0001_initial... OK
  Applying social_django.0002_add_related_name... OK
  Applying social_django.0003_alter_email_max_length... OK
  Applying social_django.0004_auto_20160423_0400... OK
  Applying social_django.0005_auto_20160727_2333... OK
  Applying social_django.0006_partial... OK
  Applying social_django.0007_code_timestamp... OK
  Applying social_django.0008_partial_timestamp... OK
  Applying telegram.0002_alter_telegrammessage_message_type... OK
  Applying twilioapp.0001_squashed_initial... OK
  Applying twilioapp.0002_auto_20220604_1008... OK
  Applying user_management.0005_rbac_permissions... OK
  Applying user_management.0006_organization_uuid... OK
  Applying user_management.0007_organization_deleted_at... OK
  Applying user_management.0008_organization_is_grafana_incident_enabled... OK

Shelestov7 commented 1 year ago

Hi muhammetbozkurt, did you find a way to solve the problem? I have the same issue.

batazor commented 1 year ago

Yes, this problem still occurs.