cypressf closed this issue 2 years ago
I'm not sure why this is happening. I stop the web service before running the database migration, so I wouldn't expect the web service to be using "crm_db". I wonder if some stuck process is still using crm_db even though the crm_backend pod has been stopped. @mjbludwig
Another strange thing is the crm_backend process still seems to be running even though the crm_backend pod has exited.
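The "is being accessed by other users" wording matches the error PostgreSQL raises when a database is dropped while sessions are still connected to it. Assuming crm_db is a PostgreSQL database (not confirmed in this thread), listing the sessions in pg_stat_activity before the migration would show which client is still connected. This is only a sketch: sessions_query is a hypothetical helper name, and the psql connection details in the usage comment are placeholders.

```shell
# Build a query listing the sessions connected to one database.
# Assumes PostgreSQL; sessions_query is a hypothetical helper name.
sessions_query() {
  printf "SELECT pid, usename, application_name, state FROM pg_stat_activity WHERE datname = '%s';\n" "$1"
}

# Usage, assuming psql can reach the server with sufficient privileges:
#   psql -d postgres -c "$(sessions_query crm_db)"
```

Any row that comes back while the web service is supposedly stopped would point at the leftover client.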
Trying to kill everything running... I followed my previous steps here
https://github.com/cypressf/mit-climate-data-viz/issues/263#issuecomment-1050032952
but the web service and database are still working, because I can visit svante3.mit.edu and it works just fine. It's wild that the processes survived somehow!
lsof -P -i TCP -s TCP:LISTEN
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
rootlessp 3707853 crm_website 10u IPv6 14348449 0t0 TCP *:8000 (LISTEN)
rootlessp 3707853 crm_website 12u IPv6 14348450 0t0 TCP *:8002 (LISTEN)
kill 3707853
lsof -P -i TCP -s TCP:LISTEN
[no processes using ports]
Using the kill above, I finally stopped the process that was serving the website API, and now the website no longer loads data. I'm concerned that another process is still running, however, and want to kill that as well if possible.
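A side note on the lsof output: rootlessp is probably a truncated rootlessport, the port-forwarding helper that rootless Podman starts for published ports (lsof shows only the first 9 characters of a command name by default). If that guess is right, killing PID 3707853 removed the port forwarding rather than the web service itself. Below is a small sketch for pulling the listener PIDs out of the listing; listener_pids is a hypothetical helper name.

```shell
# Read `lsof` listing output on stdin and print each unique PID once,
# skipping the header row. (listener_pids is a hypothetical helper name.)
# Assumption: the input has the column layout shown in the listing above.
listener_pids() {
  awk 'NR > 1 { print $2 }' | sort -u
}

# Usage, assuming lsof is installed:
#   lsof -P -i TCP -s TCP:LISTEN | listener_pids
# lsof can also print bare PIDs itself with -t:
#   lsof -t -P -i TCP -s TCP:LISTEN
```

Both listening sockets above belong to the same PID, so the helper prints that PID only once.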
Showing all remaining processes run by the crm_website user reveals a process called climate_risk_ma. That looks a little suspicious... why is the name climate_risk_ma, without the p at the end? Why is it still running without a pod?
ps -u crm_website xo pid,stat,start,time,comm
PID STAT STARTED TIME COMMAND
1499691 Ss 13:38:47 00:00:00 systemd
1499693 S 13:38:47 00:00:00 (sd-pam)
1499699 S 13:38:47 00:00:00 sshd
1499700 Ssl 13:38:47 00:00:00 fish
1499834 Ss 13:38:52 00:00:00 dbus-daemon
1533173 S 14:17:14 00:00:00 sshd
1533174 Ss+ 14:17:14 00:00:00 fish
1536937 R+ 14:25:24 00:00:00 ps
3705516 S Jun 24 00:00:00 catatonit
3707846 Ss Jun 24 00:00:00 fuse-overlayfs
3707848 S Jun 24 00:00:08 slirp4netns
3707868 Ssl Jun 24 00:00:00 conmon
3707881 Ss Jun 24 00:00:00 catatonit
3707895 Ss Jun 24 00:00:00 fuse-overlayfs
3707898 Ssl Jun 24 00:00:00 conmon
3750852 Ss Jun 24 00:00:00 fuse-overlayfs
3750861 Ssl Jun 24 00:00:00 conmon
3750874 Ssl Jun 24 00:27:24 climate_risk_ma
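On the missing p: Linux keeps at most 15 characters of a process name in comm, so a longer executable name is cut off in that ps column. That makes it likely (an assumption, not confirmed here) that the full name is climate_risk_map. A quick sketch of the truncation, with the untruncated command line available through the args column; truncate_comm is a hypothetical helper used only to illustrate the cut.

```shell
# Mimic the 15-character limit the kernel applies to a process name (comm).
# truncate_comm is a hypothetical helper, for illustration only.
truncate_comm() {
  printf '%.15s\n' "$1"
}

truncate_comm climate_risk_map   # assumed full name; prints climate_risk_ma

# To see the full command line instead of the truncated name:
#   ps -u crm_website -o pid,comm,args
```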
I killed it
pkill climate_risk_ma
ps -u crm_website xo pid,stat,start,time,comm
PID STAT STARTED TIME COMMAND
1499691 Ss 13:38:47 00:00:00 systemd
1499693 S 13:38:47 00:00:00 (sd-pam)
1499699 S 13:38:47 00:00:00 sshd
1499700 Ssl 13:38:47 00:00:00 fish
1499834 Ss 13:38:52 00:00:00 dbus-daemon
1533173 S 14:17:14 00:00:00 sshd
1533174 Ss+ 14:17:14 00:00:00 fish
1539087 R+ 14:26:57 00:00:00 ps
3705516 S Jun 24 00:00:00 catatonit
3707846 Ss Jun 24 00:00:00 fuse-overlayfs
3707848 S Jun 24 00:00:08 slirp4netns
3707868 Ssl Jun 24 00:00:00 conmon
3707881 Ss Jun 24 00:00:00 catatonit
3707895 Ss Jun 24 00:00:00 fuse-overlayfs
3707898 Ssl Jun 24 00:00:00 conmon
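pkill sends SIGTERM by default, and here the second ps listing confirms the process is gone. For next time, a slightly more careful version of the same step might look like the sketch below: send SIGTERM to a specific PID, wait for it to exit, and fall back to SIGKILL only after a timeout. stop_pid and its 10-second default are assumptions of mine, not part of the deploy scripts.

```shell
# Send SIGTERM to a PID, wait up to a timeout for it to exit,
# and only then fall back to SIGKILL.
# stop_pid is a hypothetical helper; the default timeout is an assumption.
stop_pid() {
  pid="$1"
  timeout="${2:-10}"                          # seconds to wait before escalating
  kill -TERM "$pid" 2>/dev/null || return 0   # already gone (or not ours to signal)
  while [ "$timeout" -gt 0 ]; do
    kill -0 "$pid" 2>/dev/null || return 0    # process has exited
    sleep 1
    timeout=$((timeout - 1))
  done
  kill -KILL "$pid" 2>/dev/null               # still alive after the timeout
  return 0
}

# Usage with the PID from the listing above:
#   stop_pid 3750874
```

Targeting the PID rather than a name pattern avoids accidentally matching an unrelated process with a similar name.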
I rebuilt the containers and started the pod using the script.
cd /opt/climate_risk_map_builder
./crm_build_wrapper.sh
It executed with no errors and is running the pod including the web service successfully on svante3 again.
https://github.com/cypressf/climate-risk-map/runs/7362210004?check_suite_focus=true
I reran the "deploy to development" GitHub job, and it worked this time, with no "crm_db" is being accessed by other users error. Closing this issue as fixed for now. Can reinvestigate if this error pops up again.
https://github.com/cypressf/climate-risk-map/runs/7345060227?check_suite_focus=true#step:7:29