spantaleev / matrix-docker-ansible-deploy

🐳 Matrix (An open network for secure, decentralized communication) server setup using Ansible and Docker
GNU Affero General Public License v3.0

matrix-dendrite.service service fails to start when whatsapp bridge is enabled #3199

Closed inknos closed 5 months ago

inknos commented 7 months ago

Describe the bug

When running just setup-all, the matrix-dendrite container fails to start if the matrix whatsapp bridge is enabled.

To Reproduce

My vars.yml file looks like this:

---
matrix_domain: xxx.xxx

matrix_homeserver_implementation: dendrite
matrix_dendrite_media_store_path: /mnt/nas/dendrite-media

matrix_static_files_container_labels_base_domain_enabled: true

matrix_homeserver_generic_secret_key: 'xxx'

matrix_playbook_reverse_proxy_type: playbook-managed-traefik

devture_traefik_config_certificatesResolvers_acme_email: 'aaa@bbb.c'

devture_postgres_connection_password: 'xxx'

ntfy_enabled: true

matrix_synapse_ext_password_provider_shared_secret_auth_enabled: true
matrix_synapse_ext_password_provider_shared_secret_auth_shared_secret: 'xxx'

matrix_sliding_sync_enabled: true

matrix_mautrix_whatsapp_enabled: true

Using this vars.yml, run just setup-all:

failed: [matrix.snag.social] (item={'name': 'matrix-dendrite.service', 'priority': 1000, 'groups': ['matrix', 'homeservers', 'dendrite']}) => changed=false
  ansible_loop_var: item
  item:
    groups:
    - matrix
    - homeservers
    - dendrite
    name: matrix-dendrite.service
    priority: 1000
  msg: |-
    Unable to start service matrix-dendrite.service: Job for matrix-dendrite.service failed because the control process exited with error code.
    See "systemctl status matrix-dendrite.service" and "journalctl -xe" for details.

Expected behavior

All services should be up and running.

Matrix Server:

Playbook version

# git log
commit ac24b9f20db8ff211590a5f09fe15d460e76a522 (origin/master, origin/HEAD)
Merge: 3d337dc1 c375d888
Author: Slavi Pantaleev <slavi@devture.com>
Date:   Thu Feb 22 09:13:16 2024 +0200

    Merge pull request #3197 from spantaleev/renovate/halfshot-matrix-hookshot-5.x

    chore(deps): update halfshot/matrix-hookshot docker tag to v5.2.1

Additional context

Investigation: I changed a few lines in /etc/systemd/system/matrix-dendrite.service (commenting out --rm and --log-driver=none) so that the container and its logs are preserved after it exits.

...
ExecStartPre=/usr/bin/env docker create \
#                        --rm \
                        --name=matrix-dendrite \
#                        --log-driver=none \
                        --user=997:1001 \
                        --cap-drop=ALL \
...

Then I can see this in the logs:

# docker logs matrix-dendrite
time="2024-02-22T12:51:42Z" level=fatal msg="Invalid config file: open /matrix-mautrix-whatsapp-registration.yaml: no such file or directory"

coxde commented 6 months ago

For me it also happens with the meta-instagram and telegram bridges.

I tried adding --mount type=bind,src=/matrix/mautrix-whatsapp/config/registration.yaml,dst=/matrix-mautrix-whatsapp-registration.yaml,ro to matrix-dendrite.service. Dendrite then started, but the bridge couldn't connect to it.

array-in-a-matrix commented 5 months ago

The matrix-mautrix-*-registration.yaml files are not being created anywhere on the server. This probably affects all bridges (matrix-*-*-registration.yaml).

vale981 commented 5 months ago

The bind mount arguments seem to be specified in matrix_homeserver_container_extra_arguments_auto, which is never used anywhere...

Patching it in (for dendrite):

diff --git a/roles/custom/matrix-dendrite/defaults/main.yml b/roles/custom/matrix-dendrite/defaults/main.yml
index 944d6485..d4026beb 100644
--- a/roles/custom/matrix-dendrite/defaults/main.yml
+++ b/roles/custom/matrix-dendrite/defaults/main.yml
@@ -167,7 +167,7 @@ matrix_dendrite_container_extra_arguments_auto: []
 # matrix_dendrite_container_arguments holds the final list of extra arguments to pass to the container.
 # You're not meant to override this variable.
 # If you'd like to inject your own arguments, see `matrix_dendrite_container_extra_arguments`.
-matrix_dendrite_container_arguments: "{{ matrix_dendrite_container_extra_arguments + matrix_dendrite_container_extra_arguments_auto }}"
+matrix_dendrite_container_arguments: "{{ matrix_dendrite_container_extra_arguments + matrix_dendrite_container_extra_arguments_auto + matrix_homeserver_container_extra_arguments_auto }}"

 # A list of extra arguments to pass to the container process (`dendrite-monolith` command)
 # Example:

seems to help
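
For anyone who would rather not patch the role defaults: the comment in the diff above points at matrix_dendrite_container_extra_arguments as the supported place to inject extra container arguments. A minimal sketch for vars.yml, assuming that variable takes a list of argument strings and reusing the bind mount from the earlier comment. The paths are taken from this thread and are not verified against other setups, and this only addresses the missing registration file, not the bridge connectivity problem.

```yaml
# vars.yml (sketch only; assumes the variable is a list of argument strings)
matrix_dendrite_container_extra_arguments:
  # Bind-mount the WhatsApp bridge registration file to the path Dendrite
  # reported as missing. Repeat with the matching paths for other bridges.
  - "--mount type=bind,src=/matrix/mautrix-whatsapp/config/registration.yaml,dst=/matrix-mautrix-whatsapp-registration.yaml,ro"
```

After changing vars.yml, the playbook has to be re-run (for example with just setup-all) so that the systemd unit is regenerated.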

EDIT: now the whatsapp bridge complains about not being able to talk to the homeserver.

I suspect it has something to do with the traefik migration.

array-in-a-matrix commented 5 months ago

@spantaleev The issue has not been fully fixed. Dendrite now starts up and works, but the bridges still fail. The mautrix whatsapp bridge fails to start because it cannot connect to the homeserver. The mautrix instagram bridge continuously crashes due to a missing path (404 error). Other bridges may have similar problems, so the issue should be reopened.

vale981 commented 5 months ago

Running curl http://matrix-traefik:8008/_matrix in the bridge container gives a 404.

When changing matrix-traefik to matrix-dendrite in the bridge config, the telegram and WhatsApp bridges connect, but their bots won't accept invitations. The signal bridge keeps failing, complaining that the server does not support the matrix 1.4 spec.
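
The same change could probably be made from vars.yml instead of editing the generated bridge config by hand. This is a sketch under assumptions: the variable name matrix_mautrix_whatsapp_homeserver_address does not appear in this thread and is assumed to be the playbook setting behind the bridge's homeserver address, and the bridge container is assumed to share a Docker network with matrix-dendrite. As noted above, the bridge connects this way but its bot still does not accept invitations, so treat it as a diagnostic step rather than a fix.

```yaml
# vars.yml (sketch only; the variable name is an assumption, check the role defaults)
# Point the WhatsApp bridge directly at the Dendrite container instead of
# going through matrix-traefik on port 8008.
matrix_mautrix_whatsapp_homeserver_address: "http://matrix-dendrite:8008"
```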

array-in-a-matrix commented 5 months ago

> The signal bridge keeps failing, complaining that the server does not support the matrix 1.4 spec.

The signal bridge could be failing because dendrite does not fully support the application services API. If all the other bridges are failing because of a misconfiguration in group_vars/matrix_servers, or maybe in a dendrite- or traefik-related config file, then it should also be failing for that reason.

vale981 commented 5 months ago

Hmm, this seems to be a known issue and there is a workaround: https://blog.troed.se/2024/02/19/mautrix-signal-bridge-dropping-dendrite/

I don't know if it's worth implementing though. Maybe I'll just switch to synapse.

Edit: the latest dendrite release seems to have a fix https://github.com/matrix-org/dendrite/releases/tag/v0.13.7 which has already made it in here 🎉

array-in-a-matrix commented 5 months ago

> Hmm, this seems to be a known issue and there is a workaround: https://blog.troed.se/2024/02/19/mautrix-signal-bridge-dropping-dendrite/
>
> I don't know if it's worth implementing though. Maybe I'll just switch to synapse.
>
> Edit: the latest dendrite release seems to have a fix https://github.com/matrix-org/dendrite/releases/tag/v0.13.7 which has already made it in here 🎉

I think this only fixes the signal bridge. The instagram and whatsapp bridges keep failing, so maybe an upstream patch is needed? If so, then downgrading would be a temporary fix.

@vale981 If your instance has only a single user or is new, I recommend you use synapse instead.

vale981 commented 5 months ago

Well, the underlying issue with traefik is not yet fixed. See also #3262. Yeah, I think I'll just refresh the whole server and use synapse.