matrix-org / matrix-spec

The Matrix protocol specification
Apache License 2.0

Autodiscovery process is underspecified, does not mention direct hostname connections #534

Open joepie91 opened 5 years ago

joepie91 commented 5 years ago

The Server Discovery section in the Client-Server specification states that clients...

> SHOULD use an auto-discovery mechanism to determine the server's URL based on a user's Matrix ID.

However, there are two underspecified things in the section:

  1. Does this also apply when the username and the HS hostname are entered by the user separately? For the sake of consistent behaviour, I'd argue that it probably should, but the specification currently scopes it to hostname extraction from Matrix IDs only.
  2. Where does a 'direct attempt' fit into the picture? For example, Feneas does not have a .well-known for Matrix, and clients are presumably able to still connect to Feneas because they literally attempt the specified feneas.org hostname, bypassing the .well-known process entirely.
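To illustrate point 1, here is a minimal sketch of a client that treats a separately entered hostname the same way as one extracted from a Matrix ID. The function names and the "explicit hostname wins" policy are assumptions made for illustration; the only behaviour borrowed from the specification is splitting the Matrix ID at the first colon.

```python
from typing import Optional


def server_name_from_mxid(mxid: str) -> str:
    """Return the server name part of a Matrix ID such as '@alice:example.org'."""
    if not mxid.startswith("@") or ":" not in mxid:
        raise ValueError(f"not a Matrix user ID: {mxid!r}")
    # Split at the first colon only, so a port such as ':8448' stays attached.
    return mxid.split(":", 1)[1]


def discovery_hostname(mxid: Optional[str] = None,
                       hostname: Optional[str] = None) -> str:
    """Pick the hostname to run discovery against.

    Hypothetical policy: an explicitly entered hostname goes through the
    same discovery path as one extracted from a Matrix ID.
    """
    if hostname:
        return hostname.strip().lower()
    if mxid:
        return server_name_from_mxid(mxid).lower()
    raise ValueError("need a Matrix ID or a hostname")
```

With this shape, `discovery_hostname(mxid="@alice:feneas.org")` and `discovery_hostname(hostname="feneas.org")` feed the same value into whatever discovery steps follow.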

The problem with the second point is particularly that it isn't clear how a 'direct attempt' relates to a .well-known-based attempt, ordering-wise; presumably clients are doing direct attempts out-of-spec, or Feneas would've already had complaints about their missing .well-known.

But are those clients doing that before or after the .well-known attempt? For example, a situation could exist where a Matrix server A on example-a.com has a .well-known that points at a Matrix server B on example-b.com, in which case different clients attempting to log into example-a.com would connect to different Matrix servers (A or B) depending on whether they try the literal hostname or the .well-known first.

Therefore, personally, I think that "literal hostname attempt" should be added to the list of autodiscovery methods in a well-defined position (relative to .well-known), and that it should apply consistently regardless of whether the 'base hostname' was extracted from a Matrix ID or specified explicitly.
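The ordering concern can be made concrete with a small simulation. This is only a sketch: the probes are fakes that read from in-memory tables rather than making HTTP requests, and the names `discover`, `well_known_probe`, and `literal_probe` are hypothetical, not taken from any client or from the specification.

```python
from typing import Callable, Optional, Sequence

# A probe takes a hostname and returns a base URL, or None if that method
# found nothing. Real probes would make HTTP requests; these are fakes so
# the sketch runs without network access.
Probe = Callable[[str], Optional[str]]


def discover(hostname: str, probes: Sequence[Probe]) -> Optional[str]:
    """Try each discovery method in order and return the first base URL found."""
    for probe in probes:
        base_url = probe(hostname)
        if base_url is not None:
            return base_url
    return None


# Fake world matching the example above: example-a.com runs homeserver A
# itself, but its .well-known file points at homeserver B.
WELL_KNOWN = {"example-a.com": "https://example-b.com"}
LISTENING = {"example-a.com", "example-b.com"}


def well_known_probe(hostname: str) -> Optional[str]:
    """Stand-in for fetching the .well-known file for the hostname."""
    return WELL_KNOWN.get(hostname)


def literal_probe(hostname: str) -> Optional[str]:
    """Stand-in for a 'direct attempt' against the literal hostname."""
    return f"https://{hostname}" if hostname in LISTENING else None


client_1 = discover("example-a.com", [well_known_probe, literal_probe])
client_2 = discover("example-a.com", [literal_probe, well_known_probe])
print(client_1)  # https://example-b.com
print(client_2)  # https://example-a.com
```

The two clients differ only in probe order, yet they end up on different homeservers. Fixing the position of the literal attempt in the list would remove that divergence.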

turt2live commented 5 years ago

How clients connect outside of an autodiscovery process is largely out of scope for the spec. As you've mentioned, the spec doesn't require that clients use an autodiscovery mechanism - it's also intentionally limited to Matrix IDs to prevent ambiguity about when to apply it.

There's nothing stopping clients from having their own autodiscovery based on the one in the spec. The spec does support multiple kinds of autodiscovery, and clients could have a text box saying "Enter your homeserver name" with a button that discovers the URLs.

I don't believe that mandating how clients connect to homeservers is a responsibility of the spec: we define the error interface, API surface, and preferred transport in an effort to make it clear that there are standards in place but ultimately "speaking matrix" is up to the implementation. We describe an autodiscovery function to help aid clients in getting connected, but don't require them to use it.

Opening an MSC to make the existing autodiscovery more generic would be better than an issue.

joepie91 commented 5 years ago

> How clients connect outside of an autodiscovery process is largely out of scope for the spec. As you've mentioned, the spec doesn't require that clients use an autodiscovery mechanism - it's also intentionally limited to Matrix IDs to prevent ambiguity about when to apply it.
>
> There's nothing stopping clients from having their own autodiscovery based on the one in the spec. The spec does support multiple kinds of autodiscovery, and clients could have a text box saying "Enter your homeserver name" with a button that discovers the URLs.
>
> I don't believe that mandating how clients connect to homeservers is a responsibility of the spec: we define the error interface, API surface, and preferred transport in an effort to make it clear that there are standards in place but ultimately "speaking matrix" is up to the implementation. We describe an autodiscovery function to help aid clients in getting connected, but don't require them to use it.

That seems like an odd place to draw this line, to me. I would expect the scope of the specification to be essentially "ensuring that all the baseline protocol functionality works and interoperates consistently across clients", and the authentication process seems like very much a part of baseline functionality to me.

There also really are interoperability concerns here, since different clients will behave differently on the same Matrix HS if they have different discovery mechanisms; this would (over time) translate into an inconsistent user experience, where e.g. some clients might work with some Matrix servers but not with others, which in turn only work with other clients, because of a mismatch in discovery mechanisms.

There are already some issues along these lines occurring; in my testing yesterday, I ran across two different homeservers (where the server name does not match the HS hostname) which have a .well-known/matrix/server but not a .well-known/matrix/client, and which only worked because of a Riot instance with a preconfigured HS hostname. No external client would work with these homeservers without manual configuration.
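A rough sketch of how a client or test script might detect that situation follows. It assumes the contents of the two .well-known files have already been fetched into a mapping from path to response body (a missing key meaning the file was not served); that mapping, the `diagnose` helper, and its return strings are all hypothetical. The `m.homeserver` / `base_url` keys are the ones the client file uses.

```python
import json
from typing import Mapping, Optional

CLIENT_PATH = "/.well-known/matrix/client"
SERVER_PATH = "/.well-known/matrix/server"


def client_base_url(files: Mapping[str, str]) -> Optional[str]:
    """Return m.homeserver.base_url from the client file, or None if unusable."""
    body = files.get(CLIENT_PATH)
    if body is None:
        return None
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    homeserver = data.get("m.homeserver")
    if not isinstance(homeserver, dict):
        return None
    base_url = homeserver.get("base_url")
    return base_url if isinstance(base_url, str) and base_url else None


def diagnose(files: Mapping[str, str]) -> str:
    """Classify a server by which .well-known files it serves."""
    if client_base_url(files) is not None:
        return "client discovery works"
    if SERVER_PATH in files:
        # Federation can find the server, but clients get no base URL.
        return "federation-only: needs manual client configuration"
    return "no .well-known"
```

A server that serves only the `server` file falls into the second case, which is the one described above.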

> Opening an MSC to make the existing autodiscovery more generic would be better than an issue.

I'm not sure what you mean by "more generic" in this context, but AIUI, an MSC would require a real-world implementation to be considered, which, while understandable and probably a good constraint, means I don't currently have the time to commit to opening an MSC.

turt2live commented 5 years ago

I'm drawing the line at saying that clients need to support fields for entering a custom homeserver URL. We should be making sure the spec fits what is realistically used and wanted by implementations (while also considering justification when applicable), so modifying the existing discovery mechanism to drop the "only MXIDs" part would be a good MSC.

uhoreg commented 4 years ago

FWIW, feneas does have a .well-known file. Your link 404s because it has an extra slash at the end.

joepie91 commented 4 years ago

@uhoreg I'm fairly certain that it didn't exist at the time, not without the slash either.

The trailing slash in my original post most likely came to exist because I tend to copy the final URL from the address bar after loading it (so that things like https:// etc. are effectively 'autofilled' for me), and the feneas.org webserver is set up to redirect /non/existent/path to /non/existent/path/.

turt2live commented 4 years ago

(feneas has done some re-architecting of their setup in the last couple months)