spantaleev / matrix-docker-ansible-deploy

🐳 Matrix (An open network for secure, decentralized communication) server setup using Ansible and Docker
GNU Affero General Public License v3.0

ma1sd configuration points to domain:8448 instead of matrix.domain:8448 #1125

Closed daudo closed 3 years ago

daudo commented 3 years ago

While trying to add the new go-neb integration, I stumbled upon this problem with the ma1sd configuration.

Inviting the newly created go-neb user into a room first stalled for a while, until Element showed a "server error". Checking the logs revealed the following errors:

Jun 18 15:19:37 matrix matrix-ma1sd[31301]: [XNIO-1 task-4] ERROR io.kamax.mxisd.auth.AccountManager - Unable to get user info.
Jun 18 15:19:37 matrix matrix-ma1sd[31301]: org.apache.http.conn.HttpHostConnectException: Connect to example.com:8448 [example.com/10.50.97.1] failed: Operation timed out (Connection timed out)
Jun 18 15:19:37 matrix matrix-ma1sd[31301]: #011at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:156)
Jun 18 15:19:37 matrix matrix-ma1sd[31301]: #011at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:374)
Jun 18 15:19:37 matrix matrix-ma1sd[31301]: #011at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
Jun 18 15:19:37 matrix matrix-ma1sd[31301]: #011at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
Jun 18 15:19:37 matrix matrix-ma1sd[31301]: #011at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
Jun 18 15:19:37 matrix matrix-ma1sd[31301]: #011at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
[...]

and eventually the following message was logged:

Jun 18 15:19:39 matrix matrix-ma1sd[31301]: [XNIO-1 task-6] INFO io.kamax.mxisd.matrix.HomeserverFederationResolver - Error while trying to lookup well-known for example.com
Jun 18 15:19:39 matrix matrix-ma1sd[31301]: [XNIO-1 task-6] INFO io.kamax.mxisd.matrix.HomeserverFederationResolver - Resolution of example.com to https://example.com:8448
Jun 18 15:19:39 matrix matrix-ma1sd[31301]: [XNIO-1 task-6] INFO io.kamax.mxisd.auth.AccountManager - Domain resolved: example.com => https://example.com:8448

Apparently, ma1sd tries to talk to https://example.com:8448 and, unsurprisingly, finds nothing there, because the installation is listening on matrix.example.com.

As far as I can tell, I've configured everything according to the guidelines, including SRV records for every resource as laid out in https://github.com/spantaleev/matrix-docker-ansible-deploy/blob/master/docs/configuring-dns.md

So I went ahead and looked at what ma1sd suggests for those circumstances.

The sample ma1sd config in https://github.com/ma1uta/ma1sd/blob/ae5864cd91f7db57c3a99b7847c3c327980e74e8/ma1sd.example.yaml#L18 says this:

# If the hostname of the public URL used to reach your Matrix services is different from your Matrix domain,
# per example matrix.domain.tld vs domain.tld, then use the server.name configuration option.
# See the "Configure" section of the Getting Started guide for more info.

And indeed, after adding the following to matrix_ma1sd_configuration_extension_yaml in my inventory configuration, everything started to work:

server:
  name: 'matrix.example.com'
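
For reference, here is a minimal sketch of how this override might look as a complete inventory entry. The file path in the comment is an assumption based on the playbook's usual layout; the variable name and the values are the ones quoted above, and matrix.example.com stands in for the real hostname.

```yaml
# Assumed location: inventory/host_vars/matrix.example.com/vars.yml
# The block scalar (|) passes the nested YAML through as text,
# to be merged into the generated ma1sd configuration.
matrix_ma1sd_configuration_extension_yaml: |
  server:
    name: 'matrix.example.com'
```

After changing the inventory, the playbook has to be re-run so that the ma1sd configuration is regenerated and the service restarted.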

Maybe I am missing something, but it seems the current ma1sd integration is not 100% correct. As my setup is a very vanilla one, I think this should be set by default or at least be documented accordingly.

spantaleev commented 3 years ago

Thank you for the detailed description!

Our default ma1sd configuration already configures this, as seen here: https://github.com/spantaleev/matrix-docker-ansible-deploy/blob/10fba32368583d7ff7a4481789d2c75f8fe1e9ec/roles/matrix-ma1sd/templates/ma1sd.yaml.j2#L7-L8

I wonder why yours didn't include it by default.

daudo commented 3 years ago

You are right, and I'm puzzled now :) It's difficult to say why it would have been missing from the configuration ... But just to be sure, I checked some older backups and, well, it was already in there too!

I don't know what caused the problem then and, even worse, I don't know what finally solved it ...

Closing this issue now, sorry for the noise!