Closed homework36 closed 1 month ago
I'm able to launch with
docker-compose -f production.yml up -d
and all containers are "healthy" now, but rodan2.simssa.ca is still not accessible.
The default command (swarm mode),
docker stack deploy --with-registry-auth -c production.yml rodan
still leads to failing containers.
Checked the following but still get 502 Bad Gateway:
production.env is modified.
rodan-client/config/configuration.json is modified accordingly.
See updates in #1149.
At first I thought this was an Nginx issue as in #1142, but when I started Nginx manually inside the container I got
[emerg] host not found in upstream
and the log
wait-for-app: timeout occurred after waiting 15 seconds for iipsrv:9003
I checked again and found that rodan-main fails after several minutes (even when I set the Docker container to be idle), and the containers for the Celery jobs did not launch at all. This happens on both new VMs (with GPU and vGPU). My speculation is that Celery and Nginx both depend on rodan-main, which is not working.
docker logs for rodan-main indicates that the container stops at this line. (*)
Maybe it is related to the new OS and GPU, but I'm not sure. I need to investigate further to figure out the problem.
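The two symptoms above (Nginx reporting "host not found in upstream" and wait-for-app timing out on iipsrv:9003) can point to different causes: the service name may not resolve at all, or it may resolve while nothing is listening on the port. A minimal sketch for telling these apart is below. It is not the project's wait-for-app script; the function name `probe` and its return labels are hypothetical, and it assumes Python 3 is available in a container attached to the same Docker network as the services.

```python
import socket
import time


def probe(host: str, port: int, timeout: float = 15.0, interval: float = 1.0) -> str:
    """Poll host:port until it accepts a TCP connection or the timeout expires.

    Returns one of three hypothetical labels:
      "ok"           a TCP connection succeeded
      "dns-failure"  the host name could not be resolved on the last attempt
      "unreachable"  the name resolved but the connection failed or timed out
    """
    deadline = time.monotonic() + timeout
    result = "unreachable"
    while True:
        try:
            # Name resolution only; raises socket.gaierror if the name is unknown.
            socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        except socket.gaierror:
            result = "dns-failure"
        else:
            try:
                # Try an actual TCP connection and close it immediately.
                with socket.create_connection((host, port), timeout=interval):
                    return "ok"
            except OSError:
                result = "unreachable"
        if time.monotonic() >= deadline:
            return result
        time.sleep(interval)
```

For example, calling `probe("iipsrv", 9003)` from inside the Nginx container would return "dns-failure" if the iipsrv service name is not known to Docker's internal DNS (for instance because that container never started), and "unreachable" if the name resolves but the port does not accept connections within 15 seconds.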
Updated all environment variables, and now, in the rare case where rodan-python3-celery did launch, it terminated with the following log message:
and rodan-main has this error message:
But postgres-plpython is healthy and gives the desired output. After restarting, rodan-main gives the same log as above (*).