Open sbaydush opened 3 years ago
I messed around some more; sometimes the container starts successfully and sometimes it doesn't...
I am also encountering this issue while trying to run this on a swarm of Raspberry Pi 4s.
Same here. Using Docker Swarm in Portainer on a RPi4-4GB
I am in the same boat: starting up a fresh instance is completely fine, but every subsequent startup attempt of the container falls flat on its face, and recreating the entire instance from a blank slate is the only way to get it running again. Super frustrating!
Although, upon further testing, I'm not entirely sure I'm encountering the exact same problem. I only started encountering the issue today, and the only thing I had changed was switching some hosts to use DNS challenges for the SSL certificates. I removed those entries from my database and it starts up fine again. So, for me at least, the failed startups are related to using DNS challenges for my SSL certs. Not sure if the DNS provider for the challenges makes a difference, but I use Cloudflare.
Same issue here, I'm also on an RPi4 using DNS challenge (Cloudflare) for my SSL certs.
Same issue for me: this happens on boot-up, but if I manually restart afterwards I don't have any problems.
Same issue for me; I cannot create the container on my Raspberry Pi 4.
Same issue.
4x RPi 4B Swarm
Deployment using Portainer
This issue does not arise if the stack is started via docker-compose. Very weird...
I dug into the code a little bit and it seems this issue is related to the supervisor on http://127.0.0.1:3000/. At least that's what my logs are saying. I think the timing is off: the container crashes after the JWT keys are generated and NGINX is restarted.
My theory is that the supervisor service on http://127.0.0.1:3000/ is not running yet when NGINX is restarted, and that causes s6 to panic. But why would it work when starting the stack via compose then?
Okay, let me clear up my nonsense:
The error log indicates that the service on http://127.0.0.1:3000/ is not available... that by itself has nothing to do with s6. However, I've found that if the frontend terminates, the s6 service in /var/run/s6/services is also terminated, which causes s6 to panic. @jc21 could you please take a look at this? I'm by no means a professional user of s6 and could be talking nonsense. Hope my theory helps, though!
I am having this issue when trying to run in a single-node RPi swarm, even when deploying with compose.
Occurs for me as well, at varying points in the startup. RPi4 as a manager in a swarm, with the stack running through Portainer. Pulled 2.7.3 instead of latest (no other changes to the compose) and it's running fine.
ETA: nevermind. After one reboot it decided that it didn't want to start again. That version must be happy only w/ an empty config and DB.
Has anyone found a workaround for this?
Same issue for me: 2x RPi4 (Ubuntu 20.10 x64) swarm, running NPM through Portainer v2.1.1.
s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening
finish: applet not found
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
[s6-finish] sending all processes the TERM signal.
[s6-finish] sending all processes the KILL signal and exiting.
version: "3"
services:
npm:
image: 'jc21/nginx-proxy-manager:latest'
restart: unless-stopped
ports:
# Public HTTP Port:
- '80:80'
# Public HTTPS Port:
- '443:443'
# Admin Web Port:
- '81:81'
environment:
DB_MYSQL_HOST: 'xxx.xxx.xxx.xxx'
DB_MYSQL_PORT: '3306'
DB_MYSQL_USER: 'x'
DB_MYSQL_PASSWORD: 'x'
DB_MYSQL_NAME: 'x'
DISABLE_IPV6: 'true'
volumes:
- '/mnt/docker/npm/data:/data'
- '/mnt/docker/npm/letsencrypt:/etc/letsencrypt'
Hello, my instance wasn't working. I cannot say for sure, but as soon as I truncated the audit_log on the database it started working again!
Hi!
Thanks for the recommendation, unfortunately it did not solve my problem.
s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening
I think this fixed my issue?! 🤔
image: 'jc21/nginx-proxy-manager:github-pr-975'
Update: Nevermind... 😞
Any progress with this issue? I have the same issue running on RPi 4 with docker-ce.
Hello, trying to debug this issue, I've found that truncating the certificate table on the DB finally allowed NPM to start. I had several duplicate certificates left over from a version that had timeouts. Now I'm in the process of renewing all certificates (and purging old proxy hosts). So far, so good.
Same issue as OP when stopping the stack. Debian 10 and using sqlite.
I had to delete everything from the two volumes and reconfigure from scratch.
Hello, also trying to track down this bug. I have a Docker stack deployed on an RPi 3B+ (Ubuntu x64). When starting clean, meaning no data (db, etc.) exists for the containers, everything works fine. But when deploying the compose file as a stack (service), Nginx Proxy Manager runs into this issue: the DB logs the communication-packets error and the proxy manager logs the "supervisor not listening" error. I have also tried the SQLite solution; same problem.
Are there any solutions yet?
Hello, I have the same issue. Currently, the only thing I can do is set it to restart itself in case of an error and watch it build dozens of junk containers before starting. Could it be an execution order problem?
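For anyone wanting to reproduce that retry-until-it-starts workaround, a swarm restart policy along these lines would do it. This is only a minimal sketch; the delay and attempt counts are illustrative assumptions, not values taken from the comment above.

services:
  app:
    image: 'jc21/nginx-proxy-manager:latest'
    deploy:
      restart_policy:
        condition: on-failure   # re-create the task whenever the container exits with an error
        delay: 10s              # assumed pause between attempts
        max_attempts: 10        # assumed cap, so it does not spawn failed containers forever

Each failed attempt still leaves behind an exited task container, which matches the pile-up of junk containers described above.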
Experiencing the same issue on 2.9.3 & 2.9.4 w/ 4 RPi4 8GB swarm. Was previously working, restarted the docker cluster and it won't come back up. Tried completely new stacks, sometimes it will start if you just keep letting it retry.
[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 01_s6-secret-init.sh: executing...
[cont-init.d] 01_s6-secret-init.sh: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.
❯ Enabling IPV6 in hosts: /etc/nginx/conf.d
❯ /etc/nginx/conf.d/default.conf
❯ /etc/nginx/conf.d/production.conf
❯ /etc/nginx/conf.d/include/block-exploits.conf
❯ /etc/nginx/conf.d/include/letsencrypt-acme-challenge.conf
❯ /etc/nginx/conf.d/include/proxy.conf
❯ /etc/nginx/conf.d/include/assets.conf
❯ /etc/nginx/conf.d/include/ip_ranges.conf
❯ /etc/nginx/conf.d/include/ssl-ciphers.conf
❯ /etc/nginx/conf.d/include/force-ssl.conf
❯ /etc/nginx/conf.d/include/resolvers.conf
❯ Enabling IPV6 in hosts: /data/nginx
[7/5/2021] [1:28:14 PM] [Global ] › ℹ info Generating MySQL db configuration from environment variables
[7/5/2021] [1:28:14 PM] [Global ] › ℹ info Wrote db configuration to config file: ./config/production.json
[7/5/2021] [1:28:26 PM] [Migrate ] › ℹ info Current database version: 20210210154703
[7/5/2021] [1:28:26 PM] [Setup ] › ℹ info Creating a new JWT key pair...
s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
[s6-finish] sending all processes the TERM signal.
[s6-finish] sending all processes the KILL signal and exiting.
I'm having this problem too.
Deployed as a stack in Portainer. I cannot even get it to start up successfully once as others have been able to.
Here's my compose file for the stack:
version: "3.7"
services:
app:
image: 'jc21/nginx-proxy-manager:latest'
ports:
- '80:80'
- '81:81'
- '443:443'
environment:
DB_MYSQL_HOST: "db"
DB_MYSQL_PORT: 3306
DB_MYSQL_USER: "npm"
DB_MYSQL_PASSWORD: "npm"
DB_MYSQL_NAME: "npm"
DISABLE_IPV6: "true"
networks:
- nginxproxymanager
volumes:
- app_data:/data
- letsencrypt:/etc/letsencrypt
depends_on:
- db
restart: unless-stopped
deploy:
mode: replicated
replicas: 1
placement:
constraints: [node.role == worker]
db:
image: 'jc21/mariadb-aria:latest'
environment:
MYSQL_ROOT_PASSWORD: "npm"
MYSQL_DATABASE: "npm"
MYSQL_USER: "npm"
MYSQL_PASSWORD: "npm"
networks:
- nginxproxymanager
volumes:
- db_data:/var/lib/mysql
restart: unless-stopped
deploy:
mode: replicated
replicas: 1
placement:
constraints: [node.role == worker]
networks:
nginxproxymanager:
driver: overlay
attachable: true
volumes:
db_data:
driver: local
driver_opts:
type: nfs
o: addr=10.10.10.10,nolock,soft,rw
device: ":/nfs/docker/npm/db_data"
app_data:
driver: local
driver_opts:
type: nfs
o: addr=10.10.10.10,nolock,soft,rw
device: ":/nfs/docker/npm/app_data"
letsencrypt:
driver: local
driver_opts:
type: nfs
o: addr=10.10.10.10,nolock,soft,rw
device: ":/nfs/docker/npm/letsencrypt"
App container log:
[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 01_s6-secret-init.sh: executing...
[cont-init.d] 01_s6-secret-init.sh: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.
Disabling IPV6 in hosts
❯ Disabling IPV6 in hosts: /etc/nginx/conf.d
❯ /etc/nginx/conf.d/production.conf
❯ /etc/nginx/conf.d/default.conf
❯ /etc/nginx/conf.d/include/letsencrypt-acme-challenge.conf
❯ /etc/nginx/conf.d/include/assets.conf
❯ /etc/nginx/conf.d/include/block-exploits.conf
❯ /etc/nginx/conf.d/include/proxy.conf
❯ /etc/nginx/conf.d/include/ssl-ciphers.conf
❯ /etc/nginx/conf.d/include/ip_ranges.conf
❯ /etc/nginx/conf.d/include/force-ssl.conf
❯ /etc/nginx/conf.d/include/resolvers.conf
Disabling IPV6 in hosts
❯ Disabling IPV6 in hosts: /data/nginx
[7/16/2021] [5:08:40 PM] [Global ] › ℹ info Generating MySQL db configuration from environment variables
[7/16/2021] [5:08:40 PM] [Global ] › ℹ info Wrote db configuration to config file: ./config/production.json
[7/16/2021] [5:08:41 PM] [Migrate ] › ℹ info Current database version: 20210210154703
[7/16/2021] [5:08:41 PM] [Setup ] › ℹ info Creating a new JWT key pair...
[7/16/2021] [5:08:49 PM] [Setup ] › ℹ info Wrote JWT key pair to config file: /app/config/production.json
[7/16/2021] [5:08:49 PM] [IP Ranges] › ℹ info Fetching IP Ranges from online services...
[7/16/2021] [5:08:49 PM] [IP Ranges] › ℹ info Fetching https://ip-ranges.amazonaws.com/ip-ranges.json
[7/16/2021] [5:08:57 PM] [IP Ranges] › ℹ info Fetching https://www.cloudflare.com/ips-v4
s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
[s6-finish] sending all processes the TERM signal.
[s6-finish] sending all processes the KILL signal and exiting.
DB container log shows these messages every time the app container fails:
2021-07-16 17:21:43 78 [Warning] Aborted connection 78 to db: 'npm' user: 'npm' host: '10.0.13.157' (Got an error reading communication packets)
2021-07-16 17:22:15 79 [Warning] Aborted connection 79 to db: 'npm' user: 'npm' host: '10.0.13.159' (Got an error reading communication packets)
2021-07-16 17:22:46 80 [Warning] Aborted connection 80 to db: 'npm' user: 'npm' host: '10.0.13.161' (Got an error reading communication packets)
2021-07-16 17:23:14 81 [Warning] Aborted connection 81 to db: 'npm' user: 'npm' host: '10.0.13.163' (Got an error reading communication packets)
2021-07-16 17:23:42 82 [Warning] Aborted connection 82 to db: 'npm' user: 'npm' host: '10.0.13.165' (Got an error reading communication packets)
Any help would be appreciated.
Yeah, same issue here. I was trying to figure out why I was getting a bad gateway error at the login page; my logs look the same.
I had the same issue. It works fine when I start the container again, but I would like the team to look into a permanent solution so that it does not go down at crucial times.
Getting this when running offline on an RPi 4.
Same issue when running the "Check Home Assistant configuration"
Anyone know what this issue is? Same here: it was working, now it won't, same error.
I still have this issue with 2.9.13-18. Reverted back to 2.9.12 since it's the latest version that works for me.
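If anyone else wants to try the same downgrade, pinning the image tag instead of tracking latest is all it takes. A minimal sketch, assuming the same service layout as the stacks posted earlier in this thread:

services:
  app:
    # 2.9.12 is the last tag reported working in the comment above
    image: 'jc21/nginx-proxy-manager:2.9.12'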
For me this issue was related to having a port already bound on the host that the container was trying (and failing) to bind to. The error message from Docker Compose was far more descriptive of the bind failure than the errors from within the container when I stepped backwards up the chain, as it clearly showed the port bind error.
I tracked down the host service using the port and killed it (as it wasn't meant to be using it!), this then allowed the container to boot just fine and bind to the required port.
Specifically in this case, Unraid's VM virbr0 networking had taken over port 53 on the host. This wasn't desired, so I shifted the VM networking to br0 and restarted the container. Both were then happy as PiHole could bind to port 53. (I know this wasn't related to PiHole, but the issue appears to be the same.)
Mine started doing this again; still no idea what "nproxy | s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening" means.
EDIT: I got rid of that error, but I still get a 502 bad gateway...
I give up. I looked at ports and didn't really see anything in conflict.
I have the same issue after updating and restarting Debian 11; looking at Traefik now.
Same here, running via Portainer on Synology Diskstation. Happens every morning. Redeploy then fixes it. Probably related to some cron for the cert refresh. Using DNS challenge btw.
Issue is now considered stale. If you want to keep it open, please comment :+1:
Checklist
jc21/nginx-proxy-manager:latest docker image? Yes

Describe the bug
Upon standing up a new NPM instance I get a few warnings about database communication issues, but I can log in. As soon as I tell the stack to stop and then start it again, the database starts fine but the app container fails to start (see screenshot).
To Reproduce: Deploy the latest version using the docker stack in the example. Once everything is running fine, stop the stack and then start it again.
Expected behavior: The app container should be able to be re-created pointing to the same volumes and function properly.
Screenshots App container:
Database container:
Operating System: CentOS 7 with Docker Swarm on top. I am also using Storidge as the storage plugin.
Additional context
The only way I can get it to be fixed is to delete everything about the stack and have it re-create.
Could I be missing a volume mount?
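For comparison, the stacks posted earlier in this thread only mount the two standard paths, /data and /etc/letsencrypt. A minimal sketch of those mounts (the host paths here are placeholders, not taken from the original report):

services:
  app:
    image: 'jc21/nginx-proxy-manager:latest'
    volumes:
      # application data and nginx host configuration
      - '/path/on/host/data:/data'
      # Let's Encrypt certificates
      - '/path/on/host/letsencrypt:/etc/letsencrypt'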