element-hq / element-android

A Matrix collaboration client for Android.
https://element.io/
Apache License 2.0

Discovery order for homeserver causes "network errors" #3118

Open ThoreKr opened 3 years ago

ThoreKr commented 3 years ago

Describe the bug

Setting up element-android with my homeserver via the "other" functionality, using the base URL, always yielded "No network. Please check your internet connection." However, when testing with another homeserver I got to the login prompt. In the web server logs I could also see a request to GET /_matrix/client/versions HTTP/2.0. For reference (and to check I'm not insane), I also tested my homeserver against app.element.io. The discovery was successful.

After looking at the requests and responses in Android Studio and comparing them with the other homeserver, which did connect successfully, I noted this difference:

V/FormattedJsonHttpLogger: --> GET https://other-home.server/_matrix/client/versions
V/FormattedJsonHttpLogger: <-- 404 https://www.other-home.server/_matrix/client/versions (368ms, unknown-length body)
W/RetrofitExtensionsKt: The error returned by the server is not a MatrixError
E/Request: Exception when executing request GET https://other-home.server/_matrix/client/versions
V/FormattedJsonHttpLogger: --> GET https://other-home.server/config.other-home.server.json
V/FormattedJsonHttpLogger: <-- 404 https://www.other-home.server/config.other-home.server.json (220ms, unknown-length body)
W/RetrofitExtensionsKt: The error returned by the server is not a MatrixError
E/Request: Exception when executing request GET https://other-home.server/config.other-home.server.json
V/FormattedJsonHttpLogger: --> GET https://other-home.server/config.json
V/FormattedJsonHttpLogger: <-- 404 https://www.other-home.server/config.json (214ms, unknown-length body)
W/RetrofitExtensionsKt: The error returned by the server is not a MatrixError
E/Request: Exception when executing request GET https://other-home.server/config.json
V/FormattedJsonHttpLogger: --> GET https://other-home.server/.well-known/matrix/client
V/FormattedJsonHttpLogger: <-- 200 https://other-home.server/.well-known/matrix/client (96ms, 81-byte body)
V/FormattedJsonHttpLogger: --> GET https://matrix.other-home.server/_matrix/client/versions
V/FormattedJsonHttpLogger: <-- 200 https://matrix.other-home.server/_matrix/client/versions (216ms, unknown-length body)
V/FormattedJsonHttpLogger: --> GET https://matrix.other-home.server/_matrix/client/versions
V/FormattedJsonHttpLogger: <-- 200 https://matrix.other-home.server/_matrix/client/versions (159ms, unknown-length body)
V/FormattedJsonHttpLogger: --> GET https://matrix.other-home.server/_matrix/client/r0/login
V/FormattedJsonHttpLogger: <-- 200 https://matrix.other-home.server/_matrix/client/r0/login (27ms, unknown-length body)

In comparison, my failing connection looked like this:

V/FormattedJsonHttpLogger: --> GET https://MYDOMAIN/_matrix/client/versions
V/FormattedJsonHttpLogger: <-- 200 https://other.MYDOMAIN/ (374ms, unknown-length body)
E/Request: Exception when executing request GET https://MYDOMAIN/_matrix/client/versions

This was related to the configuration of the web server, which redirected all requests except the well-known lookup to other.MYDOMAIN.

Apparently this crashes the entire lookup.

The workaround was to effectively blackhole the other lookup urls and return 404 instead.

To Reproduce

  1. Set up a homeserver on some subdomain, with a base_url different from that subdomain
  2. In the reverse proxy, configure the base domain server
    1. Create a location for the well-known endpoint as described in the Installation guide
    2. Forward all other requests somewhere else
  3. Try to configure element-android for the new homeserver

Expected behavior

Element should discover the homeserver base URL from /.well-known/matrix/client, as app.element.io does, when the reverse proxy for the base URL only serves the well-known endpoint and forwards all other requests somewhere else.
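To illustrate what the well-known lookup is expected to yield, here is a minimal sketch that extracts the homeserver URL from a /.well-known/matrix/client response. The `m.homeserver` / `base_url` keys follow the Matrix client discovery format; the function name, the example domain, and the trailing-slash handling are assumptions for illustration, not Element's code.

```python
import json


def homeserver_base_url(well_known_body: str) -> str:
    """Extract the homeserver base URL from a well-known client document.

    Hypothetical helper: raises KeyError if "m.homeserver" or "base_url"
    is missing, and ValueError (json.JSONDecodeError) if the body is not JSON.
    """
    doc = json.loads(well_known_body)
    base_url = doc["m.homeserver"]["base_url"]
    # Assumption: normalise away a trailing slash before building API URLs.
    return base_url.rstrip("/")


# Example body, similar in size to the 81-byte response in the log above.
example_body = '{"m.homeserver": {"base_url": "https://matrix.example.org/"}}'
versions_url = homeserver_base_url(example_body) + "/_matrix/client/versions"
```

With this result, the client would continue against the delegated host (matrix.example.org here) rather than the base domain.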

asimons04 commented 3 years ago

Can confirm same behavior on my installation and devices.

My root webserver virtual host (example.org) is set to serve the .well-known/matrix/server and .well-known/matrix/client endpoints as static JSON returns in the Nginx config.

My Synapse server runs at https://matrix.example.org, and is reverse-proxied to a backend server separately from the root webserver (though they are on the same host, they use different virtual hosts in Nginx).

When setting up the app, entering example.org as the homeserver should look for the well-known value and use that. However, looking at the access logs for example.org, there is never a call to example.org/.well-known/matrix/client and the first request is actually to /_matrix/client/versions. It looks like Element is not using the well-known client config at all.

ThoreKr commented 3 years ago

@asimons04 It is called if a series of conditions hold:

https://github.com/vector-im/element-android/blob/develop/matrix-sdk-android/src/main/java/org/matrix/android/sdk/internal/auth/DefaultAuthenticationService.kt#L244

The discovery order is to look for a Matrix server first, then the Element config in two different places, and finally the well-known document. However, the next location is only checked if the previous one could not be found and returned a 404. Conveniently, this feature is documented nowhere except for one small mention that users might apparently type in the URL of the webchat.

Due to the way this is chained, I couldn't figure out how to catch this: the client follows the redirect and fails internally, so it can't continue to process the chain. That also makes it surprisingly difficult to change the lookup order or to just fall back to the well-known lookup.
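The chain described above can be modelled roughly as follows. This is a simplified sketch and not the SDK's implementation: the probe order is taken from the logs earlier in this issue, while `discover`, `DiscoveryError`, and the two fake fetchers are hypothetical names used only for illustration.

```python
from typing import Callable, Optional, Tuple

# Hypothetical probe result: (HTTP status, parsed JSON body or None if not JSON).
Response = Tuple[int, Optional[dict]]


class DiscoveryError(Exception):
    """Raised when a probe fails in a way that stops the whole chain."""


def discover(domain: str, fetch: Callable[[str], Response]) -> str:
    """Return the URL of the first probe that answers 200 with a JSON body.

    Only a 404 moves on to the next probe; any other outcome ends the lookup.
    """
    probes = [
        f"https://{domain}/_matrix/client/versions",
        f"https://{domain}/config.{domain}.json",
        f"https://{domain}/config.json",
        f"https://{domain}/.well-known/matrix/client",
    ]
    for url in probes:
        status, body = fetch(url)
        if status == 404:
            continue  # not found: try the next location
        if status == 200 and body is not None:
            return url
        # For example, a redirect that ends in 200 with an HTML page.
        raise DiscoveryError(f"unexpected response from {url}")
    raise DiscoveryError("no probe succeeded")


WELL_KNOWN = {"m.homeserver": {"base_url": "https://matrix.example.org"}}


def blackholed(url: str) -> Response:
    # Base domain answers 404 for everything except the well-known document.
    if url.endswith("/.well-known/matrix/client"):
        return 200, WELL_KNOWN
    return 404, None


def redirecting(url: str) -> Response:
    # Base domain redirects everything else to a site that returns 200 + HTML.
    if url.endswith("/.well-known/matrix/client"):
        return 200, WELL_KNOWN
    return 200, None
```

In this model, `discover("example.org", blackholed)` reaches the well-known probe, while `discover("example.org", redirecting)` raises on the very first probe and never gets that far, which matches the failing log.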

This is my nginx workaround:

  location ~ ((/_matrix.*)|(config.*json)) {
    return 404;
  }
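
As a quick sanity check of which request paths that location would catch, the pattern can be tried outside nginx. This sketch assumes Python's `re` behaves like nginx's PCRE for this particular pattern, and that the `~` modifier performs an unanchored, case-sensitive match; `is_blackholed` is a hypothetical helper name.

```python
import re

# Same pattern as in the nginx location above.
WORKAROUND = re.compile(r"((/_matrix.*)|(config.*json))")


def is_blackholed(path: str) -> bool:
    """Return True if the workaround location would match this request path."""
    # search() rather than match(): the pattern is not anchored to the start.
    return WORKAROUND.search(path) is not None


checks = {
    path: is_blackholed(path)
    for path in (
        "/_matrix/client/versions",
        "/config.example.org.json",
        "/config.json",
        "/.well-known/matrix/client",
    )
}
```

Under these assumptions, the three probe paths seen in the logs match the pattern, and /.well-known/matrix/client does not, so the well-known lookup stays reachable.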

asimons04 commented 3 years ago

Thanks for the follow-up. I tried adding that custom location you sent to my root virtual host, but it didn't seem to make any difference. I'm not too concerned as it's easy enough for friends and family to use matrix.mydomain.org instead of just mydomain.org when setting up Element. Just thought if I could make it easier, I would.

JesseKPhillips commented 2 months ago

In my case I set _matrix/ and _synapse/ to reverse proxy to 8008 of the docker instance.

ThoreKr commented 2 months ago

> In my case I set _matrix/ and _synapse/ to reverse proxy to 8008 of the docker instance.

This is not ideal, as it circumvents delegation and will cause all client traffic to be proxied by the host that is running the root domain. If it is the same host, that may be acceptable, but if you want to separate them later you'll run into problems.