NginxProxyManager / nginx-proxy-manager

Docker container for managing Nginx proxy hosts with a simple, powerful interface
https://nginxproxymanager.com
MIT License

s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening #790

Open sbaydush opened 3 years ago

sbaydush commented 3 years ago


Describe the bug

Upon standing up a new NPM instance I get a few warnings about database communication issues, but I can log in. As soon as I tell the stack to stop and then start again, the database starts fine but the app container fails to start (see screenshot).

To Reproduce
Deploy the latest version using the docker stack example below. Once everything is running fine, stop the stack and then start it again.
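Roughly, on the CLI, using the compose file below (stack name, file name and service name are illustrative):

docker stack deploy -c docker-compose.yml npm   # first deploy: NPM comes up, admin UI answers on :81
docker stack rm npm                             # stop/remove the stack; the named volumes are kept
docker stack deploy -c docker-compose.yml npm   # redeploy against the same volumes
docker service logs -f npm_app                  # the s6 error shows up here on the second start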

Expected behavior
The app container should be able to be re-created pointing at the same volumes and function properly.

Screenshots
App container: (screenshot)

Database container: (screenshot)

Operating System
CentOS 7 with Docker Swarm on top. I am also using Storidge as the storage plugin.

Additional context

The only way I can fix it is to delete everything related to the stack and have it re-created.

Could I be missing a volume mount?

version: "3"
services:
  app:
    image: 'jc21/nginx-proxy-manager:latest'
    restart: always
    ports:
      # Public HTTP Port:
      - '80:80'
      # Public HTTPS Port:
      - '443:443'
      # Admin Web Port:
      - '81:81'
    environment:
      # These are the settings to access your db
      DB_MYSQL_HOST: "db"
      DB_MYSQL_PORT: 3306
      DB_MYSQL_USER: "npm"
      DB_MYSQL_PASSWORD: "Password123"
      DB_MYSQL_NAME: "npm"
      # If you would rather use Sqlite uncomment this
      # and remove all DB_MYSQL_* lines above
      # DB_SQLITE_FILE: "/data/database.sqlite"
      # Uncomment this if IPv6 is not enabled on your host
      # DISABLE_IPV6: 'true'
    volumes:
      - app:/data
      - letsencrypt:/etc/letsencrypt
    depends_on:
      - db
  db:
    image: jc21/mariadb-aria:10.4
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: 'Password123'
      MYSQL_DATABASE: 'npm'
      MYSQL_USER: 'npm'
      MYSQL_PASSWORD: 'Password123'
    volumes:
      - db:/var/lib/mysql

volumes:
  db:
    driver: cio
    driver_opts:
      profile: "MYSQL"
  app:
    driver: cio
    driver_opts:
      profile: "SMALL"
  letsencrypt:
    driver: cio
    driver_opts:
      profile: "SMALL"
sbaydush commented 3 years ago

I messed around some more; sometimes the container starts successfully and sometimes it doesn't... (screenshot)

damntourists commented 3 years ago

I am also encountering this issue while trying to run this on a swarm of raspberry pi 4's.

pixel3design-hub commented 3 years ago

> I am also encountering this issue while trying to run this on a swarm of raspberry pi 4's.

Same here. Using Docker Swarm in Portainer on a RPi4-4GB

NozomiKimot commented 3 years ago

I am in the same boat: starting up a fresh instance is completely fine, but any subsequent startup attempt of the container falls flat on its face, and recreating the entire instance from a blank slate is the only way to get it running again. Super frustrating!

NozomiKimot commented 3 years ago

Although upon further testing I'm not entirely sure I'm encountering the exact same problem. I only started seeing the issue today, and the only thing I had changed was switching some hosts to DNS challenges for their SSL certificates. I removed those entries from my database and it starts up fine again. So, for me at least, the failed startups are related to using DNS challenges for my SSL certs. Not sure if the DNS provider used for the challenges makes a difference, but I use Cloudflare.

cestnickell commented 3 years ago

Same issue here, I'm also on an RPi4 using DNS challenge (Cloudflare) for my SSL certs.

wieluk commented 3 years ago

Same issue for me: this happens on boot-up, but if I manually restart afterwards I don't have any problems.

ggantn commented 3 years ago

Same issue for me; I cannot create the container on my Raspberry Pi 4.

0x5ECF4ULT commented 3 years ago

Same issue. 4x RPi 4B swarm deployment using Portainer. This issue does not arise if the stack is started via docker-compose. Very weird... I dug into the code a little bit and it seems this issue is related to the supervisor on http://127.0.0.1:3000/. At least this is what my logs are saying. I think the timing is kind of screwed. The container crashes after the JWT keys are generated and NGINX is restarted. My theory is that the supervisor service on http://127.0.0.1:3000/ is not running yet when NGINX is restarted, and that causes s6 to panic. But why would it work when starting the stack via compose then?

0x5ECF4ULT commented 3 years ago

Okay, let me clear up my nonsense: the error log indicates that the service on http://127.0.0.1:3000/ is not available... that by itself has nothing to do with s6. However, I've found this, which means that if the frontend terminates, the s6 service in /var/run/s6/services is also terminated, and that causes s6 to panic. @jc21 could you please take a look at this? I'm by no means a professional user of s6 and could just be talking nonsense. Hope my theory helps, though!
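To illustrate the theory above: s6-overlay v2 images commonly give each supervised service a finish script that tears the whole container down when the service exits abnormally. A rough sketch of such a script (illustrating the mechanism only, not necessarily the exact file shipped in the NPM image):

#!/bin/sh
# $1 is the exit code of the supervised service (256 means it was killed by a signal).
# On an abnormal exit, ask s6-svscan to bring the whole container down.
# If s6-svscan is not listening on /var/run/s6/services at that moment (e.g. during
# shutdown, or before it is fully up), this call fails with exactly the
# "s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening" error.
if [ "$1" -ne 0 ] && [ "$1" -ne 256 ]; then
  s6-svscanctl -t /var/run/s6/services
fi

So if the frontend service terminates, its finish script asks s6 to shut the whole container down, which would match the shutdown sequence in the logs posted here.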

Riiskyy commented 3 years ago

I am having this issue when trying to run in a single-node RPi swarm, even when deploying using compose.

MaleNurse commented 3 years ago

Occurs for me as well, at varying points in the startup. RPi 4 as a manager in a swarm, with the stack running through Portainer. Pulled 2.7.3 instead of latest (no other changes to the compose) and it's running fine.

ETA: never mind. After one reboot it decided that it didn't want to start again. That version must only be happy with an empty config and DB.

serenewaffles commented 3 years ago

Has anyone found a workaround for this?

oMadMartigaNo commented 3 years ago

Same issue for me: 2x RPi 4 (Ubuntu 20.10 x64) swarm, running NPM through Portainer v2.1.1.

s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening,
finish: applet not found,
[cont-finish.d] executing container finish scripts...,
[cont-finish.d] done.,
[s6-finish] waiting for services.,
[s6-finish] sending all processes the TERM signal.,
[s6-finish] sending all processes the KILL signal and exiting.
version: "3"
services:
  npm:
    image: 'jc21/nginx-proxy-manager:latest'
    restart: unless-stopped
    ports:
      # Public HTTP Port:
      - '80:80'
      # Public HTTPS Port:
      - '443:443'
      # Admin Web Port:
      - '81:81'
    environment:
      DB_MYSQL_HOST: 'xxx.xxx.xxx.xxx'
      DB_MYSQL_PORT: '3306'
      DB_MYSQL_USER: 'x'
      DB_MYSQL_PASSWORD: 'x'
      DB_MYSQL_NAME: 'x'
      DISABLE_IPV6: 'true'
    volumes:
      - '/mnt/docker/npm/data:/data'
      - '/mnt/docker/npm/letsencrypt:/etc/letsencrypt'
sebabordon commented 3 years ago

Hello, my instance wasn't working. I cannot say for sure, but as soon as I truncated the audit_log on the database it started working again!

oMadMartigaNo commented 3 years ago

> Hello, my instance wasn't working. I cannot say for sure, but as soon as I truncated the audit_log on the database it started working again!

Hi! Thanks for the recommendation; unfortunately it did not solve my problem: s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening

oMadMartigaNo commented 3 years ago

I think this fixed my issue?! 🤔 image: 'jc21/nginx-proxy-manager:github-pr-975'

Update: Nevermind... 😞

ljpaff commented 3 years ago

Any progress with this issue? I have the same issue running on an RPi 4 with docker-ce.

sebabordon commented 3 years ago

Hello, trying to debug this issue, I've found that truncating the certificate table on the DB finally allowed NPM to start. I had several duplicate certificates in my history from a version that had timeouts. Now I'm in the process of renewing all certificates (and purging old proxy hosts). So far so good.
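For anyone who wants to try the same workaround, a rough sketch of the commands, assuming the MariaDB container from the example stack and the default npm database/user (back up the database first; this wipes the audit log and every certificate record NPM knows about):

# open a MySQL client inside the database container (container name is illustrative)
docker exec -it npm_db_1 mysql -u npm -p npm
-- then, at the MySQL prompt:
TRUNCATE TABLE audit_log;
TRUNCATE TABLE certificate;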

Kegelcizer commented 3 years ago

Same issue as OP when stopping the stack. Debian 10 and using sqlite.

I had to delete everything from the two volumes and reconfigure from scratch.

julesrulez96 commented 3 years ago

Hello, also trying to find the bug for this issue. I have a docker stack deployed on an RPi 3B+ with Ubuntu x64. When starting clean (meaning no data, DB, etc. exists for the containers), everything works fine, but when the compose file is deployed as a stack (service), Nginx Proxy Manager runs into this issue: the DB gets the communication packets error and the proxy manager gets the "supervisor not listening" error. I have also tried the SQLite solution, same problem.

Are there any solutions yet?

Rihan9 commented 3 years ago

Hello, I have the same issue. Currently, the only thing I can do is set it to restart itself in case of an error and watch it build dozens of junk containers before starting. Could it be an execution order problem?

pvd-nerd commented 3 years ago

Experiencing the same issue on 2.9.3 & 2.9.4 with a 4x RPi 4 8GB swarm. It was previously working; I restarted the docker cluster and it won't come back up. Tried completely new stacks; sometimes it will start if you just keep letting it retry.

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 01_s6-secret-init.sh: executing...
[cont-init.d] 01_s6-secret-init.sh: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.
❯ Enabling IPV6 in hosts: /etc/nginx/conf.d
  ❯ /etc/nginx/conf.d/default.conf
  ❯ /etc/nginx/conf.d/production.conf
  ❯ /etc/nginx/conf.d/include/block-exploits.conf
  ❯ /etc/nginx/conf.d/include/letsencrypt-acme-challenge.conf
  ❯ /etc/nginx/conf.d/include/proxy.conf
  ❯ /etc/nginx/conf.d/include/assets.conf
  ❯ /etc/nginx/conf.d/include/ip_ranges.conf
  ❯ /etc/nginx/conf.d/include/ssl-ciphers.conf
  ❯ /etc/nginx/conf.d/include/force-ssl.conf
  ❯ /etc/nginx/conf.d/include/resolvers.conf
❯ Enabling IPV6 in hosts: /data/nginx
[7/5/2021] [1:28:14 PM] [Global   ] › ℹ  info      Generating MySQL db configuration from environment variables
[7/5/2021] [1:28:14 PM] [Global   ] › ℹ  info      Wrote db configuration to config file: ./config/production.json
[7/5/2021] [1:28:26 PM] [Migrate  ] › ℹ  info      Current database version: 20210210154703
[7/5/2021] [1:28:26 PM] [Setup    ] › ℹ  info      Creating a new JWT key pair...
s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
[s6-finish] sending all processes the TERM signal.
[s6-finish] sending all processes the KILL signal and exiting.

canuckbrian commented 3 years ago

I'm having this problem too.

Deployed as a stack in Portainer. I cannot even get it to start up successfully once as others have been able to.

Here's my compose file for the stack:

version: "3.7"
services:
  app:
    image: 'jc21/nginx-proxy-manager:latest'
    ports:
      - '80:80'
      - '81:81'
      - '443:443'
    environment:
      DB_MYSQL_HOST: "db"
      DB_MYSQL_PORT: 3306
      DB_MYSQL_USER: "npm"
      DB_MYSQL_PASSWORD: "npm"
      DB_MYSQL_NAME: "npm"
      DISABLE_IPV6: "true"
    networks:
      - nginxproxymanager
    volumes:
      - app_data:/data
      - letsencrypt:/etc/letsencrypt
    depends_on:
      - db
    restart: unless-stopped
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints: [node.role == worker]

  db:
    image: 'jc21/mariadb-aria:latest'
    environment:
      MYSQL_ROOT_PASSWORD: "npm"
      MYSQL_DATABASE: "npm"
      MYSQL_USER: "npm"
      MYSQL_PASSWORD: "npm"
    networks:
      - nginxproxymanager
    volumes:
      - db_data:/var/lib/mysql
    restart: unless-stopped
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints: [node.role == worker]

networks:
  nginxproxymanager:
    driver: overlay
    attachable: true

volumes:
  db_data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=10.10.10.10,nolock,soft,rw
      device: ":/nfs/docker/npm/db_data"
  app_data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=10.10.10.10,nolock,soft,rw
      device: ":/nfs/docker/npm/app_data"
  letsencrypt:
    driver: local
    driver_opts:
      type: nfs
      o: addr=10.10.10.10,nolock,soft,rw
      device: ":/nfs/docker/npm/letsencrypt"

App container log:

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.,
[s6-init] ensuring user provided files have correct perms...exited 0.,
[fix-attrs.d] applying ownership & permissions fixes...,
[fix-attrs.d] done.,
[cont-init.d] executing container initialization scripts...,
[cont-init.d] 01_s6-secret-init.sh: executing... ,
[cont-init.d] 01_s6-secret-init.sh: exited 0.,
[cont-init.d] done.,
[services.d] starting services,
[services.d] done.,
Disabling IPV6 in hosts,
❯ Disabling IPV6 in hosts: /etc/nginx/conf.d,
  ❯ /etc/nginx/conf.d/production.conf,
  ❯ /etc/nginx/conf.d/default.conf,
  ❯ /etc/nginx/conf.d/include/letsencrypt-acme-challenge.conf,
  ❯ /etc/nginx/conf.d/include/assets.conf,
  ❯ /etc/nginx/conf.d/include/block-exploits.conf,
  ❯ /etc/nginx/conf.d/include/proxy.conf,
  ❯ /etc/nginx/conf.d/include/ssl-ciphers.conf,
  ❯ /etc/nginx/conf.d/include/ip_ranges.conf,
  ❯ /etc/nginx/conf.d/include/force-ssl.conf,
  ❯ /etc/nginx/conf.d/include/resolvers.conf,
Disabling IPV6 in hosts,
❯ Disabling IPV6 in hosts: /data/nginx,
[7/16/2021] [5:08:40 PM] [Global   ] › ℹ  info      Generating MySQL db configuration from environment variables,
[7/16/2021] [5:08:40 PM] [Global   ] › ℹ  info      Wrote db configuration to config file: ./config/production.json,
[7/16/2021] [5:08:41 PM] [Migrate  ] › ℹ  info      Current database version: 20210210154703,
[7/16/2021] [5:08:41 PM] [Setup    ] › ℹ  info      Creating a new JWT key pair...,
[7/16/2021] [5:08:49 PM] [Setup    ] › ℹ  info      Wrote JWT key pair to config file: /app/config/production.json,
[7/16/2021] [5:08:49 PM] [IP Ranges] › ℹ  info      Fetching IP Ranges from online services...,
[7/16/2021] [5:08:49 PM] [IP Ranges] › ℹ  info      Fetching https://ip-ranges.amazonaws.com/ip-ranges.json,
[7/16/2021] [5:08:57 PM] [IP Ranges] › ℹ  info      Fetching https://www.cloudflare.com/ips-v4,
s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening,
[cont-finish.d] executing container finish scripts...,
[cont-finish.d] done.,
[s6-finish] waiting for services.,
[s6-finish] sending all processes the TERM signal.,
[s6-finish] sending all processes the KILL signal and exiting.

DB container log shows these messages every time the app container fails:

2021-07-16 17:21:43 78 [Warning] Aborted connection 78 to db: 'npm' user: 'npm' host: '10.0.13.157' (Got an error reading communication packets),
2021-07-16 17:22:15 79 [Warning] Aborted connection 79 to db: 'npm' user: 'npm' host: '10.0.13.159' (Got an error reading communication packets),
2021-07-16 17:22:46 80 [Warning] Aborted connection 80 to db: 'npm' user: 'npm' host: '10.0.13.161' (Got an error reading communication packets),
2021-07-16 17:23:14 81 [Warning] Aborted connection 81 to db: 'npm' user: 'npm' host: '10.0.13.163' (Got an error reading communication packets),
2021-07-16 17:23:42 82 [Warning] Aborted connection 82 to db: 'npm' user: 'npm' host: '10.0.13.165' (Got an error reading communication packets)

Any help would be appreciated.

comdr-chocchip commented 2 years ago

Yeah, same issue here. I was trying to figure out why I was getting a bad gateway error at the login page; my logs look the same.

abhinav-TB commented 2 years ago

I had the same issue. It worked fine once I started the container again, but I would like the team to look into a permanent solution so that it does not go down at crucial times.

luandro commented 2 years ago

Getting this when running offline on a Raspberry Pi 4.

vdiogo commented 2 years ago

Same issue when running the "Check Home Assistant configuration"

tablatronix commented 2 years ago

Anyone know what this issue is? Same here: it was working, now it won't, same error.

Kegelcizer commented 2 years ago

I still have this issue with 2.9.13-18. Reverted back to 2.9.12 since it's the latest version that works for me.
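If you want to try the same rollback, pinning the image tag in the compose file instead of latest is all it takes (the tag shown is the one mentioned above):

    image: 'jc21/nginx-proxy-manager:2.9.12'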

othyn commented 2 years ago

For me this issue was related to having a port already bound on the host that the container was trying (and failing) to bind to. The error message from Docker Compose was far more descriptive of the bind failure than the errors from within the container when I stepped backwards up the chain, as it clearly showed the port bind error.

I tracked down the host service using the port and killed it (as it wasn't meant to be using it!), this then allowed the container to boot just fine and bind to the required port.

Specifically in this case, Unraid's VM virbr0 networking had taken over port 53 on the host. This wasn't desired, so I shifted the VM networking to br0 and restarted the container. Both were then happy as PiHole could bind to port 53. (I know this wasn't related to PiHole, but the issue appears to be the same.)
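If you suspect the same thing, a quick way to check for a host-side port conflict before starting the container (Linux host; adjust the ports to whatever your container publishes, e.g. 80/81/443 for the NPM examples above):

# show which process, if any, already owns the ports the container wants to bind
sudo ss -tulpn | grep -E ':(80|81|443) '
# or check a single port, e.g. port 53 from the Unraid example
sudo lsof -i :53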

tablatronix commented 2 years ago

Mine started doing this again; still no idea what "nproxy | s6-svscanctl: fatal: unable to control /var/run/s6/services: supervisor not listening" means.

EDIT: I got rid of that error, but I still get a 502 bad gateway...

I give up. I looked at the ports and didn't really see anything in conflict.

mureithimaina commented 1 year ago

I have the same issue after updating and restarting Debian 11; looking at Traefik now.

gering commented 1 year ago

Same here, running via Portainer on Synology Diskstation. Happens every morning. Redeploy then fixes it. Probably related to some cron for the cert refresh. Using DNS challenge btw.

github-actions[bot] commented 5 months ago

Issue is now considered stale. If you want to keep it open, please comment :+1: