spantaleev / matrix-docker-ansible-deploy

🐳 Matrix (An open network for secure, decentralized communication) server setup using Ansible and Docker
GNU Affero General Public License v3.0

Nginx cannot start after reboot server #2077

Closed Seele-Vollerei32 closed 5 months ago

Seele-Vollerei32 commented 2 years ago

Playbook Configuration:

My vars.yml file looks like this:

---
# The bare domain name which represents your Matrix identity.
# Matrix user ids for your server will be of the form (`@user:<matrix-domain>`).
#
# Note: this playbook does not touch the server referenced here.
# Installation happens on another server ("matrix.<matrix-domain>").
#
# If you've deployed using the wrong domain, you'll have to run the Uninstalling step,
# because you can't change the Domain after deployment.
#
# Example value: example.com
matrix_domain: atunemic.cn

# The Matrix homeserver software to install.
# See `roles/matrix-base/defaults/main.yml` for valid options.
matrix_homeserver_implementation: dendrite

# A secret used as a base, for generating various other secrets.
# You can put any string here, but generating a strong one is preferred (e.g. `pwgen -s 64 1`).
matrix_homeserver_generic_secret_key: '***'

# This is something which is provided to Let's Encrypt when retrieving SSL certificates for domains.
#
# In case SSL renewal fails at some point, you'll also get an email notification there.
#
# If you decide to use another method for managing SSL certificates (different than the default Let's Encrypt),
# you won't be required to define this variable (see `docs/configuring-playbook-ssl-certificates.md`).
#
# Example value: someone@example.com
matrix_ssl_lets_encrypt_support_email: 'we123445@outlook.com'

# A Postgres password to use for the superuser Postgres user (called `matrix` by default).
#
# The playbook creates additional Postgres users and databases (one for each enabled service)
# using this superuser account.
matrix_postgres_connection_password: '***'

matrix_synapse_enable_registration: true
matrix_synapse_registration_requires_token: true
matrix_synapse_registrations_require_3pid: 'email'

matrix_prometheus_enabled: false

matrix_prometheus_node_exporter_enabled: false

matrix_grafana_enabled: false

matrix_grafana_anonymous_access: false

# This has no relation to your Matrix user id. It can be any username you'd like.
# Changing the username subsequently won't work.
matrix_grafana_default_admin_user: "kevin"

# Changing the password subsequently won't work.
matrix_grafana_default_admin_password: "***"

matrix_synapse_admin_enabled: flase

matrix_synapse_ext_password_provider_shared_secret_auth_enabled: false
matrix_synapse_ext_password_provider_shared_secret_auth_shared_secret: ***

matrix_bot_mjolnir_enabled: true
matrix_bot_mjolnir_access_token: "***"
matrix_bot_mjolnir_management_room: "!AVjqHyfcl6BsDRTO:atunemic.cn"
matrix_synapse_ext_spam_checker_mjolnir_antispam_enabled: true
matrix_synapse_ext_spam_checker_mjolnir_antispam_config_block_invites: false
matrix_synapse_ext_spam_checker_mjolnir_antispam_config_block_messages: false
matrix_synapse_ext_spam_checker_mjolnir_antispam_config_block_usernames: false
matrix_synapse_ext_spam_checker_mjolnir_antispam_config_ban_lists: []

matrix_mautrix_telegram_enabled: false
matrix_mautrix_telegram_api_id: 9609852
matrix_mautrix_telegram_api_hash: ***
matrix_mautrix_telegram_bot_token: ***
matrix_mautrix_telegram_configuration_extension_yaml: |
  bridge:
    permissions:
      '*': relaybot
      '@kevin_liu:atunemic.cn': admin

matrix_dimension_enabled: true
matrix_dimension_access_token: "***"
matrix_dimension_admins:
  - "@kevin_liu:{{ matrix_domain }}"

matrix_s3_media_store_enabled: false
matrix_s3_media_store_bucket_name: "matrix-1302020253"
matrix_s3_media_store_aws_access_key: "***"
matrix_s3_media_store_aws_secret_key: "***"
matrix_s3_media_store_custom_endpoint_enabled: true
# Example: "https://storage.googleapis.com"
matrix_s3_media_store_custom_endpoint: "***"

matrix_bot_matrix_registration_bot_enabled: true
# Token obtained via logging into the bot account (see above)
matrix_bot_matrix_registration_bot_bot_access_token: "***"

# Enables registration
matrix_synapse_enable_registration: true

# Restrict registration to users with a token
matrix_synapse_registration_requires_token: true

matrix_ma1sd_enabled: true

matrix_synapse_log_level: "INFO"
matrix_synapse_storage_sql_log_level: "INFO"
matrix_synapse_root_log_level: "INFO"

Matrix Server:

Problem description:

Before I rebooted my server, the web UI would not open. I rebooted because I thought the load was too heavy for the server to handle. But after the reboot, it still won't open.

Additional context: `journalctl -fu matrix-nginx-proxy.service`

Aug 30 16:40:57 archlinux systemd[1]: matrix-nginx-proxy.service: Scheduled restart job, restart counter is at 71.
Aug 30 16:40:57 archlinux systemd[1]: Stopped Matrix nginx-proxy server.
Aug 30 16:40:57 archlinux systemd[1]: Starting Matrix nginx-proxy server...
Aug 30 16:40:57 archlinux systemd[1]: Started Matrix nginx-proxy server.
Aug 30 16:40:57 archlinux matrix-nginx-proxy[22486]: docker: Error response from daemon: driver failed programming external connectivity on endpoint matrix-nginx-proxy (87aad82eee715c36c5c704e9b17f295b0e7f3fff8ef4ebceb9705197d88cb30d): Bind for 0.0.0.0:8448 failed: port is already allocated.
Aug 30 16:40:57 archlinux systemd[1]: matrix-nginx-proxy.service: Main process exited, code=exited, status=125/n/a
Aug 30 16:40:57 archlinux systemd[1]: matrix-nginx-proxy.service: Failed with result 'exit-code'.

These log lines keep repeating in this cycle.

spantaleev commented 2 years ago

See what else could be occupying port 8448 and preventing matrix-nginx-proxy.service from starting.

`netstat -anp | grep :8448` may help.

Perhaps you had a manually installed Synapse in the past?
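
As a complement to `netstat`, here is a minimal Python sketch of a port probe. It only reports whether something is already listening on the port, not which process owns it, and the helper name `port_is_free` is hypothetical (it is not part of the playbook). It assumes the port is held by an ordinary listening socket, which is the case when `docker-proxy` publishes it.

```python
import errno
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # e.g. permission denied; not a port conflict
        return True


if __name__ == "__main__":
    # 8448 is the federation port named in the error message.
    print("8448 is free" if port_is_free(8448) else "8448 is already in use")
```

Running this while all Matrix services are stopped should show whether a leftover listener is still holding the port before the proxy is started again.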

Seele-Vollerei32 commented 2 years ago

I installed Dendrite manually in the past (from the AUR). Here is the output of netstat:

tcp        0      0 0.0.0.0:8448            0.0.0.0:*               LISTEN      992/docker-proxy    
tcp6       0      0 :::8448                 :::*                    LISTEN      997/docker-proxy
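
The last column of this output names the process holding the port. Since it is `docker-proxy` (Docker's userland proxy for published container ports), the listener belongs to Docker itself rather than to a natively installed homeserver. The sketch below pulls that column out of `netstat -anp` style output. The function name `listeners_on_port` is hypothetical, and the column layout is assumed to match the two lines pasted in this comment.

```python
def listeners_on_port(netstat_output: str, port: int) -> list[tuple[int, str]]:
    """Extract (pid, program) pairs for sockets listening on the given port."""
    owners = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state, pid/program
        if len(fields) < 7 or fields[5] != "LISTEN":
            continue
        # The port is whatever follows the last colon ("0.0.0.0:8448", ":::8448").
        if fields[3].rsplit(":", 1)[-1] != str(port):
            continue
        pid, _, program = fields[6].partition("/")
        if pid.isdigit():  # netstat prints "-" when it cannot see the owner
            owners.append((int(pid), program))
    return owners


sample = """\
tcp        0      0 0.0.0.0:8448            0.0.0.0:*               LISTEN      992/docker-proxy
tcp6       0      0 :::8448                 :::*                    LISTEN      997/docker-proxy
"""
print(listeners_on_port(sample, 8448))
# → [(992, 'docker-proxy'), (997, 'docker-proxy')]
```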

I have stopped all the Matrix services with `ansible-playbook -i inventory/hosts setup.yml --tags=stop`.

davidisaaclee commented 1 year ago

I am encountering what I believe to be the same issue.

@Seele-Vollerei32 – Did you find a solution? I'm also curious: is your server pretty low-powered (low memory / CPU)?

Some more details of my issue:

Happy to make a new issue, but this does sound like the same issue.

vars.yml

```yaml
---
matrix_domain: earthchat.online
matrix_homeserver_implementation: synapse
matrix_homeserver_generic_secret_key: 'redacted'
matrix_ssl_lets_encrypt_support_email: 'redacted'
devture_postgres_connection_password: 'redacted'
matrix_synapse_admin_enabled: true
matrix_sygnal_enabled: true
matrix_sygnal_apps: 'redacted'
# Disable non-required services
matrix_ma1sd_enabled: false
matrix_mailer_enabled: false
matrix_coturn_enabled: false
matrix_playbook_reverse_proxy_type: playbook-managed-nginx
# also tried with:
# matrix_playbook_reverse_proxy_type: playbook-managed-traefik
# devture_traefik_config_certificatesResolvers_acme_email: 'redacted'
```

journalctl -fu matrix-nginx-proxy.service (repeating)

```
Mar 30 16:53:48 synapse-avenue systemd[1]: Started Matrix nginx-proxy server.
Mar 30 16:53:49 synapse-avenue matrix-nginx-proxy[249953]: time="2023-03-30T16:53:49Z" level=error msg="error waiting for container: context canceled"
Mar 30 16:53:49 synapse-avenue matrix-nginx-proxy[249953]: Error response from daemon: driver failed programming external connectivity on endpoint matrix-nginx-proxy (7139dbe699f9e7d414e3eea5d3413dc401f7c88f27c60f6e4fdb125c0bc7a473): Bind for 0.0.0.0:8448 failed: port is already allocated
Mar 30 16:53:49 synapse-avenue systemd[1]: matrix-nginx-proxy.service: Main process exited, code=exited, status=1/FAILURE
Mar 30 16:53:49 synapse-avenue systemd[1]: matrix-nginx-proxy.service: Failed with result 'exit-code'.
Mar 30 16:54:19 synapse-avenue systemd[1]: matrix-nginx-proxy.service: Scheduled restart job, restart counter is at 118.
Mar 30 16:54:19 synapse-avenue systemd[1]: Stopped Matrix nginx-proxy server.
Mar 30 16:54:19 synapse-avenue systemd[1]: Starting Matrix nginx-proxy server...
Mar 30 16:54:20 synapse-avenue matrix-nginx-proxy[250035]: 3ede0cf7b5554906135a5060c094f86b7fdc5cbdc617bcb4813fa6b3c51ca8e7
Mar 30 16:54:20 synapse-avenue systemd[1]: Started Matrix nginx-proxy server.
```

journalctl -fu matrix-traefik.service

Very similar to nginx above

```
Mar 30 18:29:53 synapse-avenue systemd[1]: matrix-traefik.service: Failed with result 'exit-code'.
Mar 30 18:30:23 synapse-avenue systemd[1]: matrix-traefik.service: Scheduled restart job, restart counter is at 1777.
Mar 30 18:30:23 synapse-avenue systemd[1]: Stopped Traefik (matrix-traefik).
Mar 30 18:30:23 synapse-avenue systemd[1]: Starting Traefik (matrix-traefik)...
Mar 30 18:30:24 synapse-avenue matrix-traefik[271921]: 2c521dd7c60ee481943eb4235757f020b4c7840cb316ec9a10374b2d7adb4515
Mar 30 18:30:24 synapse-avenue systemd[1]: Started Traefik (matrix-traefik).
Mar 30 18:30:24 synapse-avenue matrix-traefik[271933]: Error response from daemon: driver failed programming external connectivity on endpoint matrix-traefik (2cabc67bfdf14556207a53f1a5990a81be00c17eaaa9d7ec81768d190ada94c5): Bind for 0.0.0.0:8448 failed: port is already allocated
Mar 30 18:30:24 synapse-avenue systemd[1]: matrix-traefik.service: Main process exited, code=exited, status=1/FAILURE
Mar 30 18:30:24 synapse-avenue systemd[1]: matrix-traefik.service: Failed with result 'exit-code'.
```

journalctl -fu matrix-container-socket-proxy.service

```
-- Logs begin at Fri 2023-03-17 08:25:14 UTC. --
Mar 30 15:51:39 synapse-avenue systemd[1]: matrix-container-socket-proxy.service: Main process exited, code=exited, status=137/n/a
Mar 30 15:51:39 synapse-avenue systemd[1]: matrix-container-socket-proxy.service: Failed with result 'exit-code'.
Mar 30 15:51:39 synapse-avenue systemd[1]: Stopped Container Socket Proxy (matrix-container-socket-proxy).
Mar 30 18:24:32 synapse-avenue systemd[1]: Starting Container Socket Proxy (matrix-container-socket-proxy)...
Mar 30 18:24:34 synapse-avenue matrix-container-socket-proxy[269618]: e118ffac4824afe0e1aaeaca1e25c947c163cd78105a5cbac241fffb37c00b13
Mar 30 18:24:34 synapse-avenue systemd[1]: Started Container Socket Proxy (matrix-container-socket-proxy).
Mar 30 18:24:38 synapse-avenue matrix-container-socket-proxy[269625]: [WARNING] 088/182438 (1) : Can't open server state file '/var/lib/haproxy/server-state': No such file or directory
Mar 30 18:24:38 synapse-avenue matrix-container-socket-proxy[269625]: [NOTICE] 088/182438 (1) : New worker #1 (7) forked
Mar 30 18:24:38 synapse-avenue matrix-container-socket-proxy[269625]: Proxy dockerbackend started.
Mar 30 18:24:38 synapse-avenue matrix-container-socket-proxy[269625]: Proxy dockerfrontend started.
```

Failure of `just setup-all`

```
TASK [galaxy/com.devture.ansible.role.systemd_service_manager : Fail if service isn't detected to be running] ***
failed: [matrix.earthchat.online] (item=matrix-traefik.service) => changed=false
  ansible_loop_var: item
  item: matrix-traefik.service
  msg: matrix-traefik.service was not detected to be running. It's possible that there's a configuration problem or another service on your server interferes with it (uses the same ports, etc.). Try running `systemctl status matrix-traefik.service` and `journalctl -fu matrix-traefik.service` on the server to investigate. If you're on a slow or overloaded server, it may be that services take a longer time to start and that this error is a false-positive. You can consider raising the value of the `devture_systemd_service_manager_up_verification_delay_seconds` variable. See `/redacted/matrix-docker-ansible-deploy/roles/galaxy/com.devture.ansible.role.systemd_service_manager/defaults/main.yml` for more details about that.
```

davidisaaclee commented 1 year ago

I could not figure out what the issue was, but migrating to a new instance by following https://github.com/spantaleev/matrix-docker-ansible-deploy/blob/master/docs/maintenance-migrating.md got me back up and running :(

artu-ole commented 1 year ago

Had the same issue; for me the traefik service would not start because the port was already allocated. @davidisaaclee's comment pretty much summed up all the symptoms. I was running on a 1 GB instance (t3a.micro EC2) and had failed `setup-all` runs as well, which resulted in this broken state. That was a new install, so I didn't need to preserve any configs, and after `docker system prune -a`, removing `/matrix/`, and adding a 2 GB swap file, the install went without a hitch.

luschmar commented 1 year ago

I had the very same issue, but I had some legacy configs in the following folder on the host: `/matrix/nginx-proxy/conf.d/`. First I deleted everything there. After this I noticed that some networks in `docker network ls` seemed odd. After running `ansible-playbook -i inventory/hosts setup.yml --tags=stop`, I ran `docker network prune` on the host system. Now everything works.

gouthamravee commented 1 year ago

Had a similar issue to this today. Everything seemed to be working, but the built-in traefik container kept crashing because of this error:

Error response from daemon: driver failed programming external connectivity on endpoint matrix-traefik (---): Bind for 0.0.0.0:8448 failed: port is already allocated

Turns out docker-proxy was using that port for some reason; restarting Docker as a whole fixed the issue. Just to be safe, though, I ran `docker system prune --all --volumes` to delete all the containers and networks and start over.