Open jdrew82 opened 4 months ago
Spent a couple of hours testing this. We also need to change how the readinessProbe is configured before removing the http directive from the uwsgi.ini file. Otherwise, Nautobot will never become ready, even though the uwsgi socket is running.
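To illustrate why the probe has to change: a socket bound with the socket directive speaks the binary uwsgi protocol by default, so a plain HTTP GET against it will not get an HTTP answer. Below is a minimal sketch of a health check that talks the uwsgi protocol directly. It is not what the Helm chart does today; the function names, the set of request variables, and the default host are assumptions for illustration only.

```python
import socket
import struct


def build_uwsgi_request(path: str, host: str = "localhost") -> bytes:
    """Encode a GET for `path` as a uwsgi-protocol packet (modifier1 = 0)."""
    # Hypothetical minimal variable set; a real app may need more.
    env = {
        "REQUEST_METHOD": "GET",
        "REQUEST_URI": path,
        "PATH_INFO": path,
        "QUERY_STRING": "",
        "SERVER_PROTOCOL": "HTTP/1.1",
        "SERVER_NAME": host,
        "SERVER_PORT": "8080",
        "HTTP_HOST": host,
    }
    body = b""
    for key, value in env.items():
        k, v = key.encode(), value.encode()
        # Each variable is: 16-bit LE key size, key, 16-bit LE value size, value.
        body += struct.pack("<H", len(k)) + k + struct.pack("<H", len(v)) + v
    # 4-byte header: modifier1, 16-bit LE body size, modifier2.
    return struct.pack("<BHB", 0, len(body), 0) + body


def probe(address, path: str = "/health/", timeout: float = 5.0) -> bool:
    """Return True if the app behind the uwsgi socket answers with status 200."""
    with socket.create_connection(address, timeout=timeout) as conn:
        conn.sendall(build_uwsgi_request(path))
        data = b""
        # Read until the end of the HTTP status line or until the peer closes.
        while b"\r\n" not in data:
            chunk = conn.recv(256)
            if not chunk:
                break
            data += chunk
    parts = data.split(b"\r\n", 1)[0].split()
    return len(parts) >= 2 and parts[1] == b"200"
```

An exec-style readiness probe could then call something like probe(("127.0.0.1", 8001)) and exit non-zero on False; the address here is a placeholder for whatever the socket directive actually binds to.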
I am trying to install Nautobot on an on-prem k8s cluster and am running into an issue with the basic installation.
The nautobot-default pods are failing their readiness probe:
$ kubectl -n stargate get all
NAME                                               READY   STATUS    RESTARTS      AGE
pod/nautobot-dev-celery-beat-55d8596f54-7f4rt      1/1     Running   4 (31m ago)   32m
pod/nautobot-dev-celery-default-7b9c967dd8-r8ll8   1/1     Running   2 (32m ago)   32m
pod/nautobot-dev-celery-default-7b9c967dd8-x794q   1/1     Running   2 (32m ago)   32m
pod/nautobot-dev-default-5f95f8c5cd-8mkwv          0/1     Running   0             15m
pod/nautobot-dev-default-5f95f8c5cd-s7jvz          0/1     Running   0             15m
pod/nautobot-dev-postgresql-0                      1/1     Running   0             32m
pod/nautobot-dev-redis-master-0                    1/1     Running   0             32m
$ kubectl -n stargate logs -f deployment.apps/nautobot-dev-default
Found 2 pods, using pod/nautobot-dev-default-5f95f8c5cd-8mkwv
Defaulted container "nautobot" out of: nautobot, nautobot-init (init)
[uWSGI] getting INI configuration from /opt/nautobot/uwsgi.ini
[uwsgi-static] added mapping for /static => /opt/nautobot/static
*** Starting uWSGI 2.0.23 (64bit) on [Fri Aug 2 18:33:08 2024] ***
compiled with version: 12.2.0 on 08 July 2024 22:01:50
os: Linux-5.14.0-427.24.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Jun 23 11:48:35 EDT 2024
nodename: nautobot-dev-default-5f95f8c5cd-8mkwv
machine: x86_64
clock source: unix
detected number of CPU cores: 24
current working directory: /opt/nautobot
detected binary path: /usr/local/bin/python3.11
!!! no internal routing support, rebuild with pcre support !!!
your memory page size is 4096 bytes
*** WARNING: you have enabled harakiri without post buffering. Slow upload could be rejected on post-unbuffered webservers ***
detected max file descriptor number: 1073741816
building mime-types dictionary from file /etc/mime.types...1545 entry found
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uWSGI http bound on 0.0.0.0:8080 fd 11
uwsgi socket 0 bound to TCP address 127.0.0.1:33063 (port auto-assigned) fd 10
Python version: 3.11.9 (main, Jul 3 2024, 00:12:48) [GCC 12.2.0]
--- Python VM already initialized ---
Python main interpreter initialized at 0x7f48ec8bc658
python threads support enabled
your server socket listen backlog is limited to 128 connections
your mercy for graceful operations on workers is 60 seconds
mapped 333504 bytes (325 KB) for 6 cores
*** Operational MODE: preforking+threaded ***
18:33:08.726 INFO nautobot :
Nautobot initialized!
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x7f48ec8bc658 pid: 1 (default app)
spawned uWSGI master process (pid: 1)
spawned uWSGI worker 1 (pid: 9, cores: 2)
18:33:08.753 INFO nautobot.core.wsgi :
Closing existing DB and cache connections on worker 1 after uWSGI forked ...
spawned uWSGI worker 2 (pid: 10, cores: 2)
18:33:08.756 INFO nautobot.core.wsgi :
Closing existing DB and cache connections on worker 2 after uWSGI forked ...
spawned uWSGI worker 3 (pid: 12, cores: 2)
18:33:08.758 INFO nautobot.core.wsgi :
Closing existing DB and cache connections on worker 3 after uWSGI forked ...
spawned uWSGI http 1 (pid: 14)
respawned uWSGI http 1 (pid: 16)
respawned uWSGI http 1 (pid: 17)
respawned uWSGI http 1 (pid: 18)
respawned uWSGI http 1 (pid: 19)
respawned uWSGI http 1 (pid: 20)
$ kubectl -n stargate describe pod/nautobot-dev-default-5f95f8c5cd-8mkwv
Name: nautobot-dev-default-5f95f8c5cd-8mkwv
Namespace: stargate
Priority: 0
Service Account: nautobot-dev
Node: k8-node-3.dev.chtrse.com/172.16.0.199
Start Time: Fri, 02 Aug 2024 12:32:39 -0600
Labels: app.kubernetes.io/component=nautobot-default
app.kubernetes.io/instance=nautobot-dev
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=nautobot
app.kubernetes.io/version=2.2.7
helm.sh/chart=nautobot-2.1.3
pod-template-hash=5f95f8c5cd
Annotations: cni.projectcalico.org/containerID: 0ac133db666dcb4bbec7e80ae901e3ac18a9c42ad1d40bf495bf0dff85558eb3
cni.projectcalico.org/podIP: 10.42.1.178/32
cni.projectcalico.org/podIPs: 10.42.1.178/32
Status: Running
SeccompProfile: RuntimeDefault
IP: 10.42.1.178
IPs:
IP: 10.42.1.178
Controlled By: ReplicaSet/nautobot-dev-default-5f95f8c5cd
Init Containers:
nautobot-init:
Container ID: docker://74df24620319301b71ab8af441e2cc7967909fdd54d84a957109c80cb7c1b2c6
Image: ghcr.io/nautobot/nautobot:2.2.7-py3.11
Image ID: docker-pullable://ghcr.io/nautobot/nautobot@sha256:6b936558dd5e7368b2556eae4725f053346cfa802835f5076da14281d29c5f03
Port:
Normal   Scheduled  13m                    default-scheduler  Successfully assigned stargate/nautobot-dev-default-5f95f8c5cd-8mkwv to k8-node-3.dev.chtrse.com
Normal   Pulling    13m                    kubelet            Pulling image "ghcr.io/nautobot/nautobot:2.2.7-py3.11"
Normal   Pulled     13m                    kubelet            Successfully pulled image "ghcr.io/nautobot/nautobot:2.2.7-py3.11" in 232ms (232ms including waiting)
Normal   Created    13m                    kubelet            Created container nautobot-init
Normal   Started    13m                    kubelet            Started container nautobot-init
Normal   Pulling    13m                    kubelet            Pulling image "ghcr.io/nautobot/nautobot:2.2.7-py3.11"
Normal   Pulled     13m                    kubelet            Successfully pulled image "ghcr.io/nautobot/nautobot:2.2.7-py3.11" in 254ms (254ms including waiting)
Normal   Created    13m                    kubelet            Created container nautobot
Normal   Started    13m                    kubelet            Started container nautobot
Warning  Unhealthy  3m20s (x19 over 12m)   kubelet            Readiness probe failed: Get "http://10.42.1.178:8080/health/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Can someone help me?
@sivaCode It seems the issue you posted is unrelated to the issue @jdrew82 mentioned above. Can you please open a new issue if the problem persists?
According to the uWSGI folks, using the socket and http listeners together is not advised. This is discussed here. We need to update the uwsgi.ini template so that it does not use both.
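As a rough way to spot the combination described above, the following sketch checks whether a uwsgi.ini defines both an http and a socket listener in its [uwsgi] section. The helper name and the sample addresses are hypothetical, and it assumes the file is plain enough for Python's configparser to read.

```python
import configparser


def listener_conflict(ini_text: str) -> bool:
    """Return True if the [uwsgi] section sets both `http` and `socket`."""
    parser = configparser.ConfigParser(
        strict=False,          # uwsgi.ini may repeat keys such as static-map
        interpolation=None,    # leave uWSGI's own %-style variables untouched
        allow_no_value=True,   # tolerate bare keys without "= value"
    )
    parser.read_string(ini_text)
    if not parser.has_section("uwsgi"):
        return False
    options = parser["uwsgi"]
    # Exact key match, so `http-socket` does not count as `http`.
    return "http" in options and "socket" in options


# Hypothetical sample, not the actual Nautobot template.
SAMPLE = """
[uwsgi]
http = 0.0.0.0:8080
socket = 127.0.0.1:8001
"""
```

With this sample, listener_conflict(SAMPLE) reports a conflict, while a file that sets only one of the two keys does not.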