element-hq / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://element-hq.github.io/synapse
GNU Affero General Public License v3.0
1.5k stars 185 forks source link

Error at OIDC initialization - restart in loop #17637

Open 3isenHeiM opened 2 months ago

3isenHeiM commented 2 months ago

Description

I've configured my OIDC server (Authentik) for Synapse using the official guide.

My homeserver.yml is populated as this (the URL of the matrix server is matrix.tld):

oidc_providers:
    - idp_id: authentik
      idp_name: authentik
      # discover: true
      skip_verification: true
      issuer: "https://authentik.tld/application/o/matrix/"
      client_id: <redacted>
      client_secret: <redacted>
      authorization_endpoint: "https://authentik.tld/application/o/matrix/"
      token_endpoint: "https://authentik.tld/application/o/token/"
      userinfo_endpoint: "https://authentik.tld/application/o/userinfo/"
      jwks_uri: "https://authentik.tld/application/o/matrix/jwks/"
      scopes:
          - "openid"
          - "profile"
          - "email"
      user_mapping_provider:
          config:
              localpart_template: "{{ user.preferred_username }}"
              display_name_template: "{{ user.name|capitalize }}"

However, at the docker compose up, the synapse server restarts in loop because it can't initialize the provider. It seems it reaches a timeout, from the logs, although it doesn't even wait half a second and directly restarts.

Here are the logs:

matrix-server exited with code 0
matrix-server   | Starting synapse with args -m synapse.app.homeserver --config-path /data/homeserver.yaml
matrix-server   | Error during startup:
matrix-server   | Traceback (most recent call last):
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/synapse/handlers/oidc.py", line 148, in load_metadata
matrix-server   |     await p.load_metadata()
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/synapse/handlers/oidc.py", line 610, in load_metadata
matrix-server   |     return await self._provider_metadata.get()
matrix-server   |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/synapse/util/caches/cached_call.py", line 143, in get
matrix-server   |     return await self._cachedcall.get()
matrix-server   |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/synapse/util/caches/cached_call.py", line 113, in get
matrix-server   |     self._result.raiseException()
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/twisted/python/failure.py", line 505, in raiseException
matrix-server   |     raise self.value.with_traceback(self.tb)
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/twisted/internet/defer.py", line 1999, in _inlineCallbacks
matrix-server   |     result = context.run(
matrix-server   |              ^^^^^^^^^^^^
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/twisted/python/failure.py", line 519, in throwExceptionIntoGenerator
matrix-server   |     return g.throw(self.value.with_traceback(self.tb))
matrix-server   |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/synapse/util/caches/cached_call.py", line 133, in _wrapper
matrix-server   |     return await f()
matrix-server   |            ^^^^^^^^^
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/synapse/handlers/oidc.py", line 622, in _load_metadata
matrix-server   |     metadata_response = await self._http_client.get_json(url)
matrix-server   |                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/synapse/http/client.py", line 571, in get_json
matrix-server   |     body = await self.get_raw(uri, args, headers=actual_headers)
matrix-server   |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/synapse/http/client.py", line 656, in get_raw
matrix-server   |     response = await self.request("GET", uri, headers=Headers(actual_headers))
matrix-server   |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/synapse/http/client.py", line 423, in request
matrix-server   |     response = await make_deferred_yieldable(request_deferred)
matrix-server   |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/twisted/internet/defer.py", line 1078, in _runCallbacks
matrix-server   |     current.result = callback(  # type: ignore[misc]
matrix-server   |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/synapse/http/client.py", line 964, in _timeout_to_request_timed_out_error
matrix-server   |     raise RequestTimedOutError("Timeout waiting for response from remote server")
matrix-server   | synapse.http.RequestTimedOutError: 504: Timeout waiting for response from remote server
matrix-server   | 
matrix-server   | The above exception was the direct cause of the following exception:
matrix-server   | 
matrix-server   | Traceback (most recent call last):
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/synapse/app/_base.py", line 259, in wrapper
matrix-server   |     await cb(*args, **kwargs)
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/synapse/app/homeserver.py", line 368, in start
matrix-server   |     await oidc.load_metadata()
matrix-server   |   File "/usr/local/lib/python3.11/site-packages/synapse/handlers/oidc.py", line 152, in load_metadata
matrix-server   |     raise Exception(
matrix-server   | Exception: Error while initialising OIDC provider 'oidc-authentik'

But I can browse to the Element homepage, and then when clicking on "Sign in", I get these log lines:

matrix-element  | 172.18.0.5 - - [30/Aug/2024:13:08:58 +0000] "GET /i18n/languages.241c278.json HTTP/1.1" 200 904 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36" "172.18.0.1"
matrix-element  | 172.18.0.5 - - [30/Aug/2024:13:08:58 +0000] "GET /_matrix/client/versions HTTP/1.1" 404 555 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36" "172.18.0.1"
matrix-element  | 2024/08/30 13:08:58 [error] 29#29: *6 open() "/usr/share/nginx/html/_matrix/client/versions" failed (2: No such file or directory), client: 172.18.0.5, server: localhost, request: "GET /_matrix/client/versions HTTP/1.1", host: "matrix.tld"
matrix-element  | 172.18.0.5 - - [30/Aug/2024:13:08:58 +0000] "GET /img/element-icons/i.cf33278.svg HTTP/1.1" 200 516 "https://matrix.tld/bundles/e9035f460125f26b7b0e/theme-light.css" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36" "172.18.0.1"
matrix-element  | 172.18.0.5 - - [30/Aug/2024:13:08:58 +0000] "GET /_matrix/client/v3/login HTTP/1.1" 404 555 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36" "172.18.0.1"
matrix-element  | 2024/08/30 13:08:58 [error] 29#29: *6 open() "/usr/share/nginx/html/_matrix/client/v3/login" failed (2: No such file or directory), client: 172.18.0.5, server: localhost, request: "GET /_matrix/client/v3/login HTTP/1.1", host: "matrix.tld"

I have this page:

image

I don't know why is there a timeout. My OIDC server is using a self-signed certificate, but I've patched the Dockerfile to add the CA to the trusted store.

How could I further debug this ?

Thanks

Steps to reproduce

Homeserver

self-hosted, not federated

Synapse Version

1.113.0

Installation Method

Docker (matrixdotorg/synapse)

Database

PostgreSQL

Workers

Single process

Platform

Dockerized with docker compose.

Configuration

No response

Relevant log output

matrix-element  | 172.18.0.5 - - [30/Aug/2024:13:08:58 +0000] "GET /i18n/languages.241c278.json HTTP/1.1" 200 904 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36" "172.18.0.1"
matrix-element  | 172.18.0.5 - - [30/Aug/2024:13:08:58 +0000] "GET /_matrix/client/versions HTTP/1.1" 404 555 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36" "172.18.0.1"
matrix-element  | 2024/08/30 13:08:58 [error] 29#29: *6 open() "/usr/share/nginx/html/_matrix/client/versions" failed (2: No such file or directory), client: 172.18.0.5, server: localhost, request: "GET /_matrix/client/versions HTTP/1.1", host: "matrix.tld"
matrix-element  | 172.18.0.5 - - [30/Aug/2024:13:08:58 +0000] "GET /img/element-icons/i.cf33278.svg HTTP/1.1" 200 516 "https://matrix.tld/bundles/e9035f460125f26b7b0e/theme-light.css" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36" "172.18.0.1"
matrix-element  | 172.18.0.5 - - [30/Aug/2024:13:08:58 +0000] "GET /_matrix/client/v3/login HTTP/1.1" 404 555 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36" "172.18.0.1"
matrix-element  | 2024/08/30 13:08:58 [error] 29#29: *6 open() "/usr/share/nginx/html/_matrix/client/v3/login" failed (2: No such file or directory), client: 172.18.0.5, server: localhost, request: "GET /_matrix/client/v3/login HTTP/1.1", host: "matrix.tld"

Anything else that would be useful to know?

No response

3isenHeiM commented 1 month ago

EDIT : I've tried with let's encrypt certificates and the issue is still present.

3isenHeiM commented 1 month ago

I've modified the code to avoid raising a RequestTimedOutErrors anytime there is an error to catch, because then it masks the real error.

Commenting the line https://github.com/element-hq/synapse/blob/v1.114.0/synapse/http/client.py#L933

request_deferred.addErrback(_timeout_to_request_timed_out_error)

And re-running the server allowed me to get the true error:

twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', '', 'tlsv1 alert internal error')]>]

Here are the ciphers offered by my IdP server (thanks this comment):

nmap --script ssl-enum-ciphers -p 443 auth.rtss.ninja
Starting Nmap 7.95 ( https://nmap.org ) at 2024-09-06 13:21 CEST
Nmap scan report for auth.rtss.ninja (172.18.0.1)
Host is up (0.00010s latency).

PORT    STATE SERVICE
443/tcp open  https
| ssl-enum-ciphers: 
|   TLSv1.2: 
|     ciphers: 
|       TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 (secp256r1) - A
|       TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 (secp256r1) - A
|       TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256 (secp256r1) - A
|     compressors: 
|       NULL
|     cipher preference: client
|   TLSv1.3: 
|     ciphers: 
|       TLS_AKE_WITH_AES_128_GCM_SHA256 (ecdh_x25519) - A
|       TLS_AKE_WITH_AES_256_GCM_SHA384 (ecdh_x25519) - A
|       TLS_AKE_WITH_CHACHA20_POLY1305_SHA256 (ecdh_x25519) - A
|     cipher preference: server
|_  least strength: A

Nmap done: 1 IP address (1 host up) scanned in 0.37 seconds
daedric7 commented 1 month ago

Can you confirm the length of the client secret ?

3isenHeiM commented 1 month ago

The client secret is 32 hex chars long.

daedric7 commented 1 month ago

The client secret is 32 hex chars long.

Increase it to at least 128 chars long.

3isenHeiM commented 1 month ago

My mistake, it was 64 chars long. I generate them using openssl rand -hex 32.

Will try, thanks for the hint