matrix-org / matrix-federation-tester

Tester for matrix federation written in golang.

A redirect loop for a server well-known file should (probably) be an error #108

Open jeffcasavant opened 4 years ago

jeffcasavant commented 4 years ago

I was helping @PaarthShah troubleshoot their Synapse installation this evening. The federation tester reported that there was no .well-known file and that everything was otherwise okay (the admin was using an SRV record), but Synapse threw errors when attempting to federate:

2020-07-28 19:23:21,610 - root - 244 - WARNING - GET-1801903- Error retrieving alias
2020-07-28 19:23:21,612 - synapse.http.server - 83 - ERROR - GET-1801903- Failed handle request via 'ClientDirectoryServer': <XForwardedForRequest at 0x7f4664c53dd0 method='GET' uri='/_matrix/client/r0/directory/room/%23testaroni%3Ashahpaarth.com' clientproto='HTTP/1.0' site=8008>
Traceback (most recent call last):
  File "/etc/homeserver/venv3/lib/python3.7/site-packages/synapse/http/server.py", line 228, in _async_render_wrapper
    callback_return = await self._async_render(request)
  File "/etc/homeserver/venv3/lib/python3.7/site-packages/synapse/http/server.py", line 399, in _async_render
    callback_return = await raw_callback_return
  File "/etc/homeserver/venv3/lib/python3.7/site-packages/synapse/rest/client/v1/directory.py", line 52, in on_GET
    res = await dir_handler.get_association(room_alias)
  File "/etc/homeserver/venv3/lib/python3.7/site-packages/synapse/handlers/directory.py", line 241, in get_association
    ignore_backoff=True,
  File "/etc/homeserver/venv3/lib/python3.7/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/etc/homeserver/venv3/lib/python3.7/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/etc/homeserver/venv3/lib/python3.7/site-packages/synapse/federation/transport/client.py", line 182, in make_query
    ignore_backoff=ignore_backoff,
  File "/etc/homeserver/venv3/lib/python3.7/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/etc/homeserver/venv3/lib/python3.7/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/etc/homeserver/venv3/lib/python3.7/site-packages/synapse/http/matrixfederationclient.py", line 800, in get_json
    timeout=timeout,
  File "/etc/homeserver/venv3/lib/python3.7/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/etc/homeserver/venv3/lib/python3.7/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/etc/homeserver/venv3/lib/python3.7/site-packages/synapse/http/matrixfederationclient.py", line 252, in _send_request_with_optional_trailing_slash
    response = yield self._send_request(request, **send_request_args)
  File "/etc/homeserver/venv3/lib/python3.7/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "/etc/homeserver/venv3/lib/python3.7/site-packages/synapse/http/matrixfederationclient.py", line 497, in _send_request
    raise e
synapse.api.errors.HttpResponseException: 301: b'Moved Permanently'

It turns out the server well-known file actually returned a redirect loop, and that broke federation. Section 3.1.3 of the federation spec says:

30x redirects should be followed, however redirection loops should be avoided.

which I interpret to mean that a redirect loop is not an outright error, but that homeserver implementors don't necessarily need to detect one.

I filed this bug against the federation tester rather than Synapse due to that interpretation.

It's also intuitive to me that the federation tester should not sign off on instances that the most popular homeserver won't successfully federate with, but that's more of a de facto specification than an explicit one, so it's kind of a judgement call.


PaarthShah commented 4 years ago

As an update: when I checked a few hours later, whatever had been cached and stuck seemed to have resolved itself, and my homeserver was able to properly federate/communicate with other Matrix homeservers, including matrix.org.

PaarthShah commented 4 years ago

I would definitely agree that the federation tester ought to at least raise a warning about the redirect, as it might save some other poor soul a few hours of grief. I don't necessarily think that this ought to be handled by Synapse itself, as it was ultimately a misconfiguration on my part.