spantaleev / matrix-docker-ansible-deploy

🐳 Matrix (An open network for secure, decentralized communication) server setup using Ansible and Docker
GNU Affero General Public License v3.0

A dependency job for matrix-coturn.service failed #3085

Open sati-bodhi opened 10 months ago

sati-bodhi commented 10 months ago

Playbook Configuration:

My vars.yml file looks like this:

---
# The bare domain name which represents your Matrix identity.
# Matrix user ids for your server will be of the form (`@user:<matrix-domain>`).
#
# Note: this playbook does not touch the server referenced here.
# Installation happens on another server ("matrix.<matrix-domain>").
#
# If you've deployed using the wrong domain, you'll have to run the Uninstalling step,
# because you can't change the Domain after deployment.
#
# Example value: example.com
matrix_domain: wencai.org

# The Matrix homeserver software to install.
# See:
#  - `roles/custom/matrix-base/defaults/main.yml` for valid options
#  - the `docs/configuring-playbook-IMPLEMENTATION_NAME.md` documentation page, if one is available for your implementation choice
matrix_homeserver_implementation: synapse

# A secret used as a base, for generating various other secrets.
# You can put any string here, but generating a strong one is preferred (e.g. `pwgen -s 64 1`).
matrix_homeserver_generic_secret_key: 'mysecret'

# By default, the playbook manages its own Traefik (https://doc.traefik.io/traefik/) reverse-proxy server.
# It will retrieve SSL certificates for you on-demand and forward requests to all other components.
# For alternatives, see `docs/configuring-playbook-own-webserver.md`.
matrix_playbook_reverse_proxy_type: playbook-managed-traefik

# Ensure that public urls use https
matrix_playbook_ssl_enabled: true

# Enable the web-secure (port 443) endpoint, which enables SSL certificate retrieval
devture_traefik_config_entrypoint_web_secure_enabled: true

devture_traefik_config_certificatesResolvers_acme_use_staging: false

# This is something which is provided to Let's Encrypt when retrieving SSL certificates for domains.
#
# If you decide to use another method for managing SSL certificates (different than the default Let's Encrypt),
# you won't be required to define this variable (see `docs/configuring-playbook-ssl-certificates.md`).
#
# Example value: someone@example.com
devture_traefik_config_certificatesResolvers_acme_email: 'sati@wencai.org'

# A Postgres password to use for the superuser Postgres user (called `matrix` by default).
#
# The playbook creates additional Postgres users and databases (one for each enabled service)
# using this superuser account.
devture_postgres_connection_password: 'mypassword'

# By default, we configure Coturn's external IP address using the value specified for `ansible_host` in your `inventory/hosts` file.
# If this value is an external IP address, you can skip this section.
#
# If `ansible_host` is not the server's external IP address, you have 2 choices:
# 1. Uncomment the line below, to allow IP address auto-detection to happen (more on this below)
# 2. Uncomment and adjust the line below to specify an IP address manually
#
# By default, auto-detection will be attempted using the `https://ifconfig.co/json` API.
# Default values for this are specified in `matrix_coturn_turn_external_ip_address_auto_detection_*` variables in the Coturn role
# (see `roles/custom/matrix-coturn/defaults/main.yml`).
#
# If your server has multiple IP addresses, you may define them in another variable which allows a list of addresses.
# Example: `matrix_coturn_turn_external_ip_addresses: ['1.2.3.4', '4.5.6.7']`
#
# matrix_coturn_turn_external_ip_address: ''

# Bridges

# Shared Secret Auth (required for double-puppeting)
matrix_synapse_ext_password_provider_shared_secret_auth_enabled: true
matrix_synapse_ext_password_provider_shared_secret_auth_shared_secret: 'mysecret'

# WhatsApp Bridge
matrix_mautrix_whatsapp_enabled: true
matrix_mautrix_whatsapp_bridge_relay_enabled: true

# Telegram Bridge
matrix_mautrix_telegram_enabled: true
matrix_mautrix_telegram_api_id: 000
matrix_mautrix_telegram_api_hash: 'xxx'

Matrix Server:

Ansible:

I run Ansible on the Matrix server itself. The server is hosted on a VM in Proxmox.

ansible [core 2.16.2]
  config file = /home/sati/matrix-docker-ansible-deploy/ansible.cfg
  configured module search path = ['/home/sati/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/sati/.local/pipx/venvs/ansible/lib/python3.10/site-packages/ansible
  ansible collection location = /home/sati/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/sati/.local/bin/ansible
  python version = 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] (/home/sati/.local/pipx/venvs/ansible/bin/python)
  jinja version = 3.1.2
  libyaml = True

Problem description:


Service failed to start with this error:

ok: [matrix.wencai.org] => (item={'name': 'matrix-traefik-certs-dumper.service', 'priority': 3500, 'groups': ['matrix', 'traefik-certs-dumper']})
failed: [matrix.wencai.org] (item={'name': 'matrix-coturn.service', 'priority': 4000, 'groups': ['matrix', 'coturn']}) => changed=false 
  ansible_loop_var: item
  item:
    groups:
    - matrix
    - coturn
    name: matrix-coturn.service
    priority: 4000
  msg: |-
    Unable to start service matrix-coturn.service: A dependency job for matrix-coturn.service failed. See 'journalctl -xe' for details.
changed: [matrix.wencai.org] => (item={'name': 'matrix-coturn-reload.timer', 'priority': 5000, 'groups': ['matrix', 'coturn']})

PLAY RECAP ***************************************************************************************************************************************
matrix.wencai.org          : ok=384  changed=11   unreachable=0    failed=1    skipped=597  rescued=0    ignored=0 

Additional context

It seems that Let's Encrypt certificate creation is not working, so the certs dumper never receives a certificate to dump.

$ sudo journalctl -xe | grep matrix-traefik-certs-dumper-wait-for-domain@matrix.wencai.org.service
Jan 09 03:43:19 matrix systemd[1]: matrix-traefik-certs-dumper-wait-for-domain@matrix.wencai.org.service: Main process exited, code=exited, status=1/FAILURE
░░ An ExecStart= process belonging to unit matrix-traefik-certs-dumper-wait-for-domain@matrix.wencai.org.service has exited.
Jan 09 03:43:19 matrix systemd[1]: matrix-traefik-certs-dumper-wait-for-domain@matrix.wencai.org.service: Failed with result 'exit-code'.
░░ The unit matrix-traefik-certs-dumper-wait-for-domain@matrix.wencai.org.service has entered the 'failed' state with result 'exit-code'.
░░ Subject: A start job for unit matrix-traefik-certs-dumper-wait-for-domain@matrix.wencai.org.service has failed
░░ A start job for unit matrix-traefik-certs-dumper-wait-for-domain@matrix.wencai.org.service has finished with a failure.
Jan 09 04:13:47 matrix sudo[107855]:     sati : TTY=pts/0 ; PWD=/home/sati/matrix-docker-ansible-deploy ; USER=root ; COMMAND=/usr/bin/systemctl status matrix-traefik-certs-dumper-wait-for-domain@matrix.wencai.org.service
Jan 09 05:11:39 matrix sudo[112667]:     sati : TTY=pts/0 ; PWD=/home/sati/matrix-docker-ansible-deploy ; USER=root ; COMMAND=/usr/bin/systemctl status matrix-traefik-certs-dumper-wait-for-domain@matrix.wencai.org.service
$ sudo systemctl status matrix-traefik-certs-dumper-wait-for-domain@matrix.wencai.org.service
× matrix-traefik-certs-dumper-wait-for-domain@matrix.wencai.org.service - Traefik certs dumper waiter (matrix-traefik-certs-dumper-wait-for-domain) for matrix.wencai.org
Loaded: loaded (/etc/systemd/system/matrix-traefik-certs-dumper-wait-for-domain@.service; disabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2024-01-09 03:43:19 UTC; 1h 43min ago
Main PID: 103557 (code=exited, status=1/FAILURE)
CPU: 143ms

Jan 09 03:43:18 matrix matrix-traefik-certs-dumper-matrix.wencai.org[103557]: Giving up waiting for /matrix/traefik-certs-dumper/dumped-certificates/matrix.wencai.org/certificate.crt is missing.. Waiting (180/180)...
Jan 09 03:43:19 matrix systemd[1]: matrix-traefik-certs-dumper-wait-for-domain@matrix.wencai.org.service: Main process exited, code=exited, status=1/FAILURE
Jan 09 03:43:19 matrix systemd[1]: matrix-traefik-certs-dumper-wait-for-domain@matrix.wencai.org.service: Failed with result 'exit-code'.
Jan 09 03:43:19 matrix systemd[1]: Failed to start Traefik certs dumper waiter (matrix-traefik-certs-dumper-wait-for-domain) for matrix.wencai.org.
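
The log above shows a waiter unit polling for `certificate.crt` and giving up after 180 attempts. As a rough illustration of that kind of bounded wait (a sketch only, not the playbook's actual script; the function name `wait_for_file` and its arguments are hypothetical):

```shell
#!/usr/bin/env bash
# Sketch of a bounded wait for a file. NOT the playbook's real script.
# Usage: wait_for_file PATH MAX_ATTEMPTS SLEEP_SECONDS
wait_for_file() {
  local path="$1" max_attempts="$2" interval="$3" attempt
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    # Succeed as soon as the file shows up.
    if [ -f "$path" ]; then
      return 0
    fi
    echo "$path is missing. Waiting ($attempt/$max_attempts)..."
    sleep "$interval"
  done
  # The file never appeared: report failure so a dependent unit fails too.
  echo "Giving up waiting for $path" >&2
  return 1
}
```

A non-zero exit from such a waiter is what systemd reports as `status=1/FAILURE`, and any unit that depends on it (here `matrix-coturn.service`) then fails with "A dependency job ... failed". The waiter is a symptom: the underlying question is why the certificate was never dumped.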

The per-domain directory that should hold the certificate was not even created.

$ sudo ls -l /matrix/traefik-certs-dumper/dumped-certificates
total 4
drwxr-xr-x 2 matrix matrix 4096 Jan  9 03:42 private

The matrix_ssl_base_path directory does not exist either.

roles/custom/matrix-nginx-proxy/defaults/main.yml:matrix_ssl_base_path: "{{ matrix_base_data_path }}/ssl"
$ sudo ls -l /matrix/ssl
ls: cannot access '/matrix/ssl': No such file or directory

ppkhoa commented 10 months ago

Are you using wildcard certs? If yes, check this out: https://github.com/spantaleev/matrix-docker-ansible-deploy/blob/master/docs/howto-srv-server-delegation.md#adjust-coturns-configuration

jjsfatllc commented 9 months ago

Hello, I'm facing the same issue and I'm not using wildcard certs.

spantaleev commented 9 months ago

There are many similar issues on this repository. Search and you shall find. Check Traefik logs - it's probably failing to obtain certificates for matrix.DOMAIN - most likely due to a firewall or DNS issue.
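
The checks suggested above (DNS, firewall, Traefik logs) could be sketched as a few shell helpers. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the systemd unit name `matrix-traefik` is assumed by analogy with the other `matrix-*` units in the logs, and the dumped-certificates path is the one shown in the original report.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic helpers; names and the unit name are assumptions.

# Print the IPv4 addresses a hostname resolves to (prints nothing if none).
resolve_host() {
  getent ahostsv4 "$1" | awk '{print $1}' | sort -u
}

# Succeed if a TCP connection to HOST PORT can be opened within 5 seconds.
# Run this from a machine outside your network to test the firewall.
port_open() {
  timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Succeed if a non-empty dumped certificate exists for DOMAIN under BASE_DIR.
cert_dumped() {
  [ -s "$1/$2/certificate.crt" ]
}

# Example usage (adjust the domain; commented out so nothing runs on load):
#   resolve_host matrix.wencai.org
#   port_open matrix.wencai.org 80  && echo "port 80 reachable"
#   port_open matrix.wencai.org 443 && echo "port 443 reachable"
#   cert_dumped /matrix/traefik-certs-dumper/dumped-certificates matrix.wencai.org
#   sudo journalctl -u matrix-traefik --no-pager | grep -i acme
```

If `resolve_host` does not print the server's public IP address, or ports 80 and 443 are unreachable from outside, Let's Encrypt is unlikely to be able to validate the domain, which would match the missing `certificate.crt` reported in this issue.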