spantaleev / matrix-docker-ansible-deploy

🐳 Matrix (An open network for secure, decentralized communication) server setup using Ansible and Docker
GNU Affero General Public License v3.0

Some Features Break When Using Own Webserver (Traefik) #3147

Closed: Handrail9 closed this issue 8 months ago

Handrail9 commented 8 months ago

Describe the bug

When using the "Own Webserver" option from the documentation, specifically the "Traefik managed by you" section, a lot of the bridges seem to break. I've tested several of the Mautrix bridges and none of them worked. I did a fresh install to rule out a configuration issue on my side, and that didn't seem to be the cause. I then added a single Mautrix puppet bridge (Discord) and I believe I found where the issue lies. I am filing this issue here, instead of with the bridges, because the problem seems to stem from the bridge not being able to reach the Traefik router.

To Reproduce

My vars.yml file looks like this:

# The bare domain name which represents your Matrix identity.
# Matrix user ids for your server will be of the form (`@user:<matrix-domain>`).
#
# Note: this playbook does not touch the server referenced here.
# Installation happens on another server ("matrix.<matrix-domain>").
#
# If you've deployed using the wrong domain, you'll have to run the Uninstalling step,
# because you can't change the Domain after deployment.
#
# Example value: example.com
matrix_domain: mydomain.com

# The Matrix homeserver software to install.
# See:
#  - `roles/custom/matrix-base/defaults/main.yml` for valid options
# - the `docs/configuring-playbook-IMPLEMENTATION_NAME.md` documentation page, if one is available for your implementation choice
matrix_homeserver_implementation: synapse

# A secret used as a base, for generating various other secrets.
# You can put any string here, but generating a strong one is preferred (e.g. `pwgen -s 64 1`).
matrix_homeserver_generic_secret_key: '[Randomized Password 1]'

# By default, the playbook manages its own Traefik (https://doc.traefik.io/traefik/) reverse-proxy server.
# It will retrieve SSL certificates for you on-demand and forward requests to all other components.
# For alternatives, see `docs/configuring-playbook-own-webserver.md`.
#matrix_playbook_reverse_proxy_type: playbook-managed-traefik
matrix_playbook_reverse_proxy_type: other-traefik-container

# Uncomment and adjust if your Traefik container is on another network
matrix_playbook_reverse_proxy_container_network: myNetwork

# Adjust to point to your Traefik container
matrix_playbook_reverse_proxy_hostname: traefik

devture_traefik_certs_dumper_ssl_dir_path: "/opt/traefik/cert"

# Uncomment and tweak the variable below if the name of your federation entrypoint is different
# than the default value (matrix-federation).
# matrix_federation_traefik_entrypoint_name: matrix-federation
# This is something which is provided to Let's Encrypt when retrieving SSL certificates for domains.
#
# In case SSL renewal fails at some point, you'll also get an email notification there.
#
# If you decide to use another method for managing SSL certificates (different than the default Let's Encrypt),
# you won't be required to define this variable (see `docs/configuring-playbook-ssl-certificates.md`).
#
# Example value: someone@example.com
devture_traefik_config_certificatesResolvers_acme_email: 'matrixemail@mydomain.com'

# A Postgres password to use for the superuser Postgres user (called `matrix` by default).
#
# The playbook creates additional Postgres users and databases (one for each enabled service)
# using this superuser account.
devture_postgres_connection_password: '[Randomized Password 2]'

# By default, we configure Coturn's external IP address using the value specified for `ansible_host` in your `inventory/hosts` file.
# If this value is an external IP address, you can skip this section.
#
# If `ansible_host` is not the server's external IP address, you have 2 choices:
# 1. Uncomment the line below, to allow IP address auto-detection to happen (more on this below)
# 2. Uncomment and adjust the line below to specify an IP address manually
#
# By default, auto-detection will be attempted using the `https://ifconfig.co/json` API.
# Default values for this are specified in `matrix_coturn_turn_external_ip_address_auto_detection_*` variables in the Coturn role
# (see `roles/custom/matrix-coturn/defaults/main.yml`).
#
# If your server has multiple IP addresses, you may define them in another variable which allows a list of addresses.
# Example: `matrix_coturn_turn_external_ip_addresses: ['1.2.3.4', '4.5.6.7']`
#
# matrix_coturn_turn_external_ip_address: ''
exim_relay_sender_address: "matrixemail@mydomain.com"
exim_relay_relay_use: true
exim_relay_relay_host_name: "email.mydomain.com"
exim_relay_relay_host_port: 587
exim_relay_relay_auth: true
exim_relay_relay_auth_username: "matrixemail@mydomain.com"
exim_relay_relay_auth_password: "[Randomized Password 3]"
matrix_bot_matrix_registration_bot_enabled: true

# By default, the playbook will set up the bot with a username like this: `@bot.matrix-registration-bot:DOMAIN`.
# To use a different username, uncomment & adjust the variable below:
# matrix_bot_matrix_registration_bot_matrix_user_id_localpart: bot.matrix-registration-bot

# Generate a strong password here. Consider generating it with `pwgen -s 64 1`
matrix_bot_matrix_registration_bot_bot_password: '[Randomized Password 4]'

# Enables registration
matrix_synapse_enable_registration: true

# Restrict registration to users with a token
matrix_synapse_registration_requires_token: true
matrix_synapse_admin_enabled: true
matrix_synapse_ext_password_provider_shared_secret_auth_enabled: true
matrix_synapse_ext_password_provider_shared_secret_auth_shared_secret: '[Randomized Password 5]'
matrix_mautrix_discord_enabled: true


Starting from a fresh config and using your own Traefik container, most bots and bridges don't work.

Expected behavior

The bots and bridges work.

Additional context

Here is the output of journalctl -fu matrix-mautrix-discord.service:

Feb 02 10:02:05 fedora matrix-mautrix-discord[2101427]: Feb  2, 2024 05:59:05 WRN Request failed, retrying error="Get \"http://traefik:8008/_matrix/client/versions?user_id=%40discordbot%3Amydomain.com\": dial tcp 172.18.0.70:8008: connect: connection refused" as_user_id=@discordbot:mydomain.com req_id=119 retry_in_seconds=4
Feb 02 10:02:09 fedora matrix-mautrix-discord[2101427]: Feb  2, 2024 05:59:09 WRN Request failed, retrying error="Get \"http://traefik:8008/_matrix/client/versions?user_id=%40discordbot%3Amydomain.com\": dial tcp 172.18.0.70:8008: connect: connection refused" as_user_id=@discordbot:mydomain.com req_id=119 retry_in_seconds=8
Feb 02 10:02:17 fedora matrix-mautrix-discord[2101427]: Feb  2, 2024 05:59:17 WRN Request failed, retrying error="Get \"http://traefik:8008/_matrix/client/versions?user_id=%40discordbot%3Amydomain.com\": dial tcp 172.18.0.70:8008: connect: connection refused" as_user_id=@discordbot:mydomain.com req_id=119 retry_in_seconds=16
Feb 02 10:02:33 fedora matrix-mautrix-discord[2101427]: Feb  2, 2024 05:59:33 WRN Request failed, retrying error="Get \"http://traefik:8008/_matrix/client/versions?user_id=%40discordbot%3Amydomain.com\": dial tcp 172.18.0.70:8008: connect: connection refused" as_user_id=@discordbot:mydomain.com req_id=119 retry_in_seconds=32
Feb 02 10:03:05 fedora matrix-mautrix-discord[2101427]: Feb  2, 2024 06:00:05 ERR Request failed error="request error: Get \"http://traefik:8008/_matrix/client/versions?user_id=%40discordbot%3Amydomain.com\": dial tcp 172.18.0.70:8008: connect: connection refused" as_user_id=@discordbot:mydomain.com duration=2.417464 method=GET req_id=119 url=http://traefik:8008/_matrix/client/versions?user_id=%40discordbot%3Amydomain.com

It looks like it has something to do with the bot looking for traefik port 8008 for matrix, instead of 8448 or 443. (I don't know much but I know matrix uses those two ports, and I think 8008 is used at some point but not in some use cases?) I had tried using this playbook a few months back and this wasn't an issue, but I did have a different issue at that time with the users database not allowing new users, and I finally got around to having free time to work on my homelab and now there's this issue lol. Any help with a quick fix for this is greatly appreciated.

spantaleev commented 8 months ago

Check what /etc/systemd/system/matrix-mautrix-discord.service looks like.

Given that you've configured:

matrix_playbook_reverse_proxy_type: other-traefik-container
matrix_playbook_reverse_proxy_container_network: myNetwork
matrix_playbook_reverse_proxy_hostname: traefik

You should see it being connected to the myNetwork network via docker network connect. Looking at your logs, we can see that it's talking to http://traefik:8008 as expected.


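What you've probably missed is the new step for configuring the new matrix-internal-matrix-client-api entrypoint on your own Traefik instance. See the following references:

* The [Traefik now has an extra job](https://github.com/spantaleev/matrix-docker-ansible-deploy/blob/1b5cbf24c362471bd7d0a96870ec5fe5a4a03b97/CHANGELOG.md#traefik-now-has-an-extra-job) changelog entry

* The [People managing their own Traefik instance need to do minor changes](https://github.com/spantaleev/matrix-docker-ansible-deploy/blob/1b5cbf24c362471bd7d0a96870ec5fe5a4a03b97/CHANGELOG.md#people-managing-their-own-traefik-instance-need-to-do-minor-changes) changelog entry

* The updated [Traefik managed by you](https://github.com/spantaleev/matrix-docker-ansible-deploy/blob/master/docs/configuring-playbook-own-webserver.md#traefik-managed-by-you) section in `docs/configuring-playbook-own-webserver.md`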
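
As a rough illustration of that extra entrypoint, here is a minimal docker-compose sketch for a self-managed Traefik container. The entrypoint name and port 8008 come from this thread, and the container and network names mirror the vars.yml above. The image tag, the Docker provider flags, and the socket mount are assumptions, so adapt them to whatever your existing Traefik setup already uses.

```yaml
# docker-compose.yml fragment (illustrative sketch, not the playbook's official example)
services:
  traefik:
    image: docker.io/traefik:v2.11    # assumed tag; keep the version you already run
    container_name: traefik           # must match matrix_playbook_reverse_proxy_hostname
    networks:
      - myNetwork                     # must match matrix_playbook_reverse_proxy_container_network
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      # ... keep your existing entrypoints (HTTPS, federation, etc.) here ...
      # Internal entrypoint that bridges and bots use to reach the homeserver,
      # e.g. http://traefik:8008/_matrix/client/versions as seen in the logs above.
      - "--entrypoints.matrix-internal-matrix-client-api.address=:8008"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro

networks:
  myNetwork:
    external: true                    # assumes the network already exists
```

Since the bridges reach this entrypoint over the shared Docker network, the sketch does not publish port 8008 on the host. After restarting Traefik with the new entrypoint, the "connection refused" errors on `traefik:8008` should presumably stop.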

Handrail9 commented 8 months ago

> Check what /etc/systemd/system/matrix-mautrix-discord.service looks like.
>
> Given that you've configured:
>
> matrix_playbook_reverse_proxy_type: other-traefik-container
> matrix_playbook_reverse_proxy_container_network: myNetwork
> matrix_playbook_reverse_proxy_hostname: traefik
>
> You should see it being connected to the myNetwork network via docker network connect. Looking at your logs, we can see that it's talking to http://traefik:8008 as expected.
>
> What you've probably missed is the new step for configuring the new matrix-internal-matrix-client-api entrypoint on your own Traefik instance. See the following references:
>
> * The [Traefik now has an extra job](https://github.com/spantaleev/matrix-docker-ansible-deploy/blob/1b5cbf24c362471bd7d0a96870ec5fe5a4a03b97/CHANGELOG.md#traefik-now-has-an-extra-job) changelog entry
>
> * The [People managing their own Traefik instance need to do minor changes](https://github.com/spantaleev/matrix-docker-ansible-deploy/blob/1b5cbf24c362471bd7d0a96870ec5fe5a4a03b97/CHANGELOG.md#people-managing-their-own-traefik-instance-need-to-do-minor-changes) changelog entry
>
> * The updated [Traefik managed by you](https://github.com/spantaleev/matrix-docker-ansible-deploy/blob/master/docs/configuring-playbook-own-webserver.md#traefik-managed-by-you) section in `docs/configuring-playbook-own-webserver.md`

Oh jeez, you're right. I've apparently had the old tab open and cached so long I didn't even notice there was a new change. My apologies!! Thank you for letting me know.