matrix-org / matrix-spec-proposals

Proposals for changes to the matrix specification
Apache License 2.0

Support for discovering API endpoints via .well-known URIs (SPEC-121) #433

Closed matrixbot closed 6 years ago

matrixbot commented 9 years ago

Documentation: https://docs.google.com/document/d/1OdEj06qA7diURofyonIMgTR3fB_pWf12Txye41qd-U4/edit, https://docs.google.com/document/d/1vF-uWlUYmf1Xo161m871H1upJbwiIPeikWGWzaE_lrU/edit#
Author: @maxidor, others
Shepherd: @uhoreg
PRs: matrix-org/matrix-spec-proposals#1359

We have several reasons why we might want to use .well-known URIs to discover API endpoints:

See also SYWEB-224 and SYN-167

We should just get on and do it. Unsure whether SRV should trump .well-known URIs or not for server-server traffic.

(Imported from https://matrix.org/jira/browse/SPEC-121)

(Reported by @ara4n)

matrixbot commented 9 years ago

Jira watchers: @ara4n

matrixbot commented 9 years ago

Links exported from Jira:

relates to https://github.com/matrix-org/matrix-doc/issues/434

matrixbot commented 9 years ago

Some potential considerations regarding SSL SNI (server name indicators):

eternaleye (IRC) M-Erik: ISTR there having been a problematic interaction in the past between SRV and SNI with how Matrix used them, which showed up when one person using CloudFlare tried to set it up
Jul 6 22:45 M-Erik: And there was a discussion re: using .well-known/ JSON to do what is currently being done with SRV
Jul 6 22:46 M-Erik: Did that wind up going anywhere?

...

[02:32] [01:42:58] Arathorn: IIRC, the issue with CloudFlare was that SNI + SRV sends SNI for the name the SRV record is on, while the cert was for the name it points to
[02:32] [01:44:30] Arathorn: The relevant RFC lays out that the SNI being for the domain the record is on is the correct one: https://tools.ietf.org/html/rfc6125#section-6
[02:32] [01:44:32] Title: RFC 6125 - Representation and Verification of Domain-Based Application Service Identity within Internet Public Key Infrastructure Using X.509 (PKIX) Certificates in the Context of Transport Layer Security (TLS) (at tools.ietf.org)
[02:32] [01:45:56] Arathorn: So _matrix._tcp.matrix.org would use an SNI value of "matrix.org", even if the SRV record points to "blargfoo.org"
[02:32] [01:47:52] Arathorn: https://github.com/matrix-org/matrix-doc/issues/433 came out of that discussion, but I dunno if a ticket got filed for that actual issue
[02:32] M-NEBot/#matrix [01:48:06] https://matrix.org/jira/browse/SPEC-121 : Support for discovering API endpoints via .well-known URIs [Pending Triage,P2,reporter=Matthew Hodgson,assignee=]
[02:32] [01:48:38] Arathorn: If well-known is used, though, that "resolution" happens in the application level, and SNI will almost certainly be "blargfoo.org" there.
[02:32] [01:48:47] Arathorn: Not sure what's better, though
[02:32] [01:49:46] Arathorn: The current behavior prevents some common things on existing sites from coexisting with Matrix, while the latter behavior would make running multiple Matrix services on one port infeasible.
[02:32] [01:50:11] Arathorn: I'll also note that CNAME behaves similarly to SRV here - the "from" is in the SNI, not the "to"

-- @ara4n
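
The SNI difference described in the log above can be sketched in a few lines of Python. This is only an illustration of the two behaviours as the log describes them; the function names are hypothetical and nothing here is taken from an actual Matrix implementation.

```python
from urllib.parse import urlsplit


def sni_with_srv(matrix_domain: str, srv_target: str) -> str:
    """SNI name when the server is located through an SRV record.

    Per the log above (citing RFC 6125), the name the record is *on* is
    used. `srv_target` is accepted only to make the contrast explicit;
    it only decides where the TCP connection goes, not the SNI value.
    """
    return matrix_domain


def sni_with_well_known(discovered_base_url: str) -> str:
    """SNI name when the base URL was resolved at the application level.

    The client simply connects to the discovered URL, so the SNI value is
    that URL's own hostname.
    """
    host = urlsplit(discovered_base_url).hostname
    if host is None:
        raise ValueError(f"no hostname in {discovered_base_url!r}")
    return host
```

With the names from the log, `sni_with_srv("matrix.org", "blargfoo.org")` gives `"matrix.org"`, while `sni_with_well_known("https://blargfoo.org:8448/")` gives `"blargfoo.org"`.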

madduck commented 7 years ago

I was asked in vector-im/riot-android#727 to chime in here. This is a high-priority feature we should add ASAP, as I think it greatly impacts user experience, and currently in negative ways:

When I try to log in to Riot-Android (F-Droid version here), I am asked for my "email address or username". There's an option to specify custom server options, but it's labelled advanced and I should not need to touch it.

However, when I try to log in as madduck@madduck.net with the correct password, I am told the password is wrong. This is probably because Riot tries to log me in to the default matrix.org homeserver with the username madduck@madduck.net, when in fact I expect it to do a SRV lookup on the domain to figure out the homeserver:

% dig +short SRV _matrix._tcp.madduck.net
0 0 8448 matrix.madduck.net.

I also tried @madduck:madduck.net, but that also fails.

If I enter https://matrix.madduck.net:8448/ manually into the advanced server field and change the username to just madduck, then login works. But this is not what users expect, especially not those who don't want to be confronted with "advanced" settings.
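
As a rough sketch of what the client could do on the user's behalf, the `dig` answer above already contains everything needed to build the URL that had to be typed in by hand. The helper names below are hypothetical, the code assumes HTTPS on the advertised port, and whether clients should read the `_matrix._tcp` record at all is exactly what this issue is debating.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_srv_answer(line: str) -> SrvRecord:
    """Parse one line of `dig +short SRV` output: priority weight port target."""
    priority, weight, port, target = line.split()
    # Drop the trailing dot of the fully qualified target name.
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


def base_url_from_srv(record: SrvRecord) -> str:
    """Build a base URL from an SRV record (assumes TLS on that port)."""
    return f"https://{record.target}:{record.port}/"


record = parse_srv_answer("0 0 8448 matrix.madduck.net.")
print(base_url_from_srv(record))  # prints https://matrix.madduck.net:8448/
```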

XMPP clients do use SRV records to determine which server to login to (_xmpp-client._tcp.example.org), but I am willing to accept that SRV records are for federation, and that well-known files are better suited for client-side apps.

It shouldn't matter anyway. The important thing IMHO is to make sure that users can log in with least effort and least surprises.

dbkr commented 7 years ago

This has just come up again on #matrix-dev: https://matrix.to/#/!XqBunHwQIXUiqCaoxq:matrix.org/$1492701491891267IJVIs:matrix.org

I've put what I think are the options into a spec proposal google doc: https://docs.google.com/document/d/1RHQ5DEA6_2IZ5m2UIysPf8CuYaup-NrincEbV4Ef3jE/edit#heading=h.ouq2tlo7ae5z

ara4n commented 7 years ago

Let's try to avoid this getting lost for another 2 years. A major reason for Mastodon's success seems to be not having to piss around with DNS, for instance.

Does anyone actually object to just adding .well-known URIs as the preferred discovery mechanism, keeping SRV as backup for weird people (e.g. Leo, Amdocs) who would find it easier to do DNS rather than edit their web root?

IT-Sumpfling commented 7 years ago

Strongly supporting the "let's not get this lost". I set up a homeserver yesterday because Matrix sounded like everything I and my friends searched for to replace Messengers like WhatsApp. But the missing "autodiscovery" definitely means this is not ready for mainstream users - or all of them must use matrix.org.

As for "what to implement": use all methods you know, define an order in which the clients have to try them. Every method will have drawbacks for one or another, but having several and knowing which one trumps the others will help to use "the right one".

Just a few known autodiscover-Methods from other software:

Sorry for my English, I'm not a native speaker ...

gergelypolonkai commented 6 years ago

For Matrix IDs this sounds great. The content might be the homeserver address to use, with an optional IS address (which would auto-fill the “advanced” fields of the client).

uhoreg commented 6 years ago

What would be needed to get this done? There seems to be a general consensus that .well-known (with maybe an SRV fallback) is the way to go. So it seems to me like what is left is:

  1. deciding on what suffix to use within .well-known (probably /.well-known/matrix or /.well-known/matrix.json would be the most logical)
  2. deciding on what information needs to be stored and how e.g. { "homeserver_url": "https://matrix.org/" }? Is any other information needed other than the base URL? e.g. whether open registration is allowed? (although extra metadata might be better served as a Matrix endpoint rather than as part of .well-known)
  3. decide on an SRV record type (maybe _matrix-client._tcp)
  4. decide if the Identity Server can be used to return the homeserver when doing 3PID lookups for login (ref. vector-im/riot-web#3930)
  5. writing up the spec
  6. register with the Well-known URI registry (as per 5.1 of RFC5785)
  7. implement it in all the clients
  8. ...
  9. Profit!!!

Is there anything else that needs to be done?

(re. 1. and 2.: let the bikeshedding begin!)

Edit: reorder steps as per @IT-Sumpfling 's suggestion
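
To make item 2 concrete, here is a minimal sketch of reading a document shaped like the `{ "homeserver_url": "https://matrix.org/" }` example above. The key name is the one floated in that item, not a settled format, and the HTTPS requirement is an assumption added for the sketch.

```python
import json


def homeserver_url_from_well_known(body: str) -> str:
    """Extract the base URL from a well-known document shaped like the
    example in item 2, i.e. {"homeserver_url": "https://matrix.org/"}."""
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("well-known document must be a JSON object")
    url = data.get("homeserver_url")
    # Assumption: only HTTPS base URLs are accepted.
    if not isinstance(url, str) or not url.startswith("https://"):
        raise ValueError("missing or non-HTTPS homeserver_url")
    return url


print(homeserver_url_from_well_known('{ "homeserver_url": "https://matrix.org/" }'))
```

Unknown keys are simply ignored here, which would leave room for the extra metadata mentioned in item 2 without breaking older clients.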

gergelypolonkai commented 6 years ago

An Identity Server URL might also come in handy. I know there are plans to federate those, too, but until then if I trust is.example.org but not the evil vector.im one, it would be nice to let clients know. Also, if my HS is not federated, maybe my IS isnʼt either.

Also, I vote for JSON, as clients already have to speak JSON anyway. Or was that question only about “JSON file having the .json extension or not”?

IT-Sumpfling commented 6 years ago

For

  1. I would vote for /.well-known/matrix/matrix.json - this leaves space if there is ever need for additional ".well-known matrix infos" e.g. /.well-known/matrix/cert-pinning.pem (not a good idea, just an example)
  2. at least the homeserver_url, better also the identity-server. Start slow, prepare to extend :)

"5." should be decided upon before "3." - because the SRV-entry should be part of the spec (and yes, _matrix-client._tcp sounds good)

"5.b." decide if and how an identity-server could return this info (if it doesn't already do this)

  1. should contain a DEFINITE order in which the various things ( /.well-known , SRV-entry, perhaps well-known hostname, perhaps identity-server) are tried and which errors (e.g. HTTP-404 for /.well-known, timeout, ...) trigger trying the next possibility. Because if you leave this to the clients, things will break for one client, but not for another (and you don't want this - see XMPP)

"8." Grofit ;)
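
The "DEFINITE order" idea above can be sketched as a fixed, ordered list of discovery methods with explicit rules for when to move on. This is an illustration only: the `discover` function, the method signature, and the choice of which errors trigger a fallback are all assumptions, since the thread has not defined them.

```python
from typing import Callable, Optional, Sequence, Tuple

# A discovery method takes a domain and returns a base URL, returns None when
# it has no answer (e.g. HTTP 404, no SRV record), or raises on hard errors.
Method = Callable[[str], Optional[str]]


def discover(domain: str, methods: Sequence[Tuple[str, Method]]) -> Tuple[str, str]:
    """Try each method in the given order; return (method_name, base_url)."""
    for name, method in methods:
        try:
            url = method(domain)
        except (TimeoutError, ConnectionError):
            # Assumption: timeouts and connection failures mean
            # "try the next method" rather than "give up".
            continue
        if url is not None:
            return name, url
    raise LookupError(f"no discovery method produced a homeserver for {domain}")
```

Because the order lives in one list that every client would share, two clients cannot disagree about which method wins, which is the XMPP-style breakage the comment warns about.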

uhoreg commented 6 years ago

@IT-Sumpfling you're absolutely right about doing 5 before 3. That was a brainfart on my part. I've updated the list to reflect this. I've also added a step regarding Identity Servers.

As far as using a matrix directory, I personally would prefer to keep things simple, especially if we want SRV to be approximately-equivalent to .well-known. I think that JSON should give us more than enough flexibility.

For the same reason, I would also suggest against putting identity server information in .well-known; if we want to discover the identity servers, I think it would be better to add an endpoint to the homeserver for that, so that clients who use SRV can still access the information.

@gergelypolonkai the question was a broad question that included both what format to use (XML or JSON), and about whether to use a .json suffix for the file. But it was also about the base filename as well (e.g. matrix, matrixdotorg, ...) Personally, my vote would be for matrix.json (and of course using JSON format).

IT-Sumpfling commented 6 years ago

Hmm, I can see where you are heading, still I would vote for a sub-dir for the following reasons:

  1. plays nicer with additional .well-known-URIs on the same server (e.g. for letsencrypt: /.well-known/acme-challenge/ )
  2. makes it easier when registering the well-known URI (see https://tools.ietf.org/html/rfc5785#section-5.1 ). I guess it would suffice to register the "namespace" /.well-known/matrix/ and have complete control over all "subspaces" - without having to re-register with IETF (and you could always set a default-document for /.well-known/matrix/ on your webserver)

To the other points:

IT-Sumpfling commented 6 years ago

Oh, and just read the FIRST entry again - we should probably include two entries in the .well-known-File: one "client-homeserver" and one "federation-homeserver" (again in parallel to the SRV-entries ...)

maxidorius commented 6 years ago

Personally, I would never go with duplication of information: it's the best way to introduce human errors and duplicate work.

Federation already has the most suitable mechanism: an SRV record. There is no need to work further on this. For clients, the only method commonly available to all known clients is .well-known, so let's just go with that? It's not a matter of which method you choose, as long as the method makes sense for all the concerned parties and there is only one.

As for the content of the well-known entry, you will need at least two since those are the two that you can currently configure in clients:

Homeserver URL
Identity server URL

maxidorius commented 6 years ago

> but so far the identity-server part is so centralized and non-federated that it's better to leave it.

@IT-Sumpfling You should check out mxisd then. matrix.org and sydent are NOT the only pieces of the puzzle here.

IT-Sumpfling commented 6 years ago

> As for the content of the well-known entry, you will need at least two since those are the two that you can currently configure in clients:
>
> Homeserver URL
> Identity server URL

Hmm, I still do not see how identity server would make sense in .well-known. Yes, you can configure an identity-server in clients, BUT isn't that used before .well-known would even be queried?

AFAIK the identity-server is used to resolve from an ID in the form user @ maildomain to a "fully qualified" matrix id in the form @ user : matrixdomain - and THEN you could query .well-known on matrixdomain for the homeserver. OK, as said above, I could probably use an identity server per maildomain, but even this will not work in many cases (e.g. freemail addresses with matrix.org users).

And then I didn't even start to take into account the resolution from mobile numbers to matrix-ids ... on which domain would you query the .well-known for the identity-server if I try to login via +49 123 456789?

Correct me if I'm wrong, but at the moment I would say: if the user tries to login via phone number or email and is NOT registered at the "well-known central Identity-server" - he or she must definitely enter an identity server URL

maxidorius commented 6 years ago

@IT-Sumpfling the identity server URL can be input when you login, so auto-discovery of it would happen at the same time as discovering the homeserver.

As for how it works, that's not correct and you don't take into account other mechanisms. I would let you read the IS spec and mxisd README if you want more details, but ignoring the identity server URL in a well-known is a big no-no for me (at least with the current client mechanisms)

IT-Sumpfling commented 6 years ago

Hmm, I take for granted that you are more experienced regarding IS specs than me (I just skimmed over them now) - but IMHO for an identity-server entry in .well-known to make sense, the following questions would have to be answered:

  1. should it be used for client-login ID-resolution (i.e. resolve anyone @ hotmail.com (email) to @ user : example.org (matrix-id) )? If yes: which domain should I query for the .well-known to find the identity server? hotmail.com ? If yes, I would be FORCED to register my matrix-ID at hotmail.com ... Phone number resolution - which domain to query for .well-known? Really a chicken-and-egg problem (IMHO)
  2. If not used for client-login - why is it better to provide the id-server associated with the example.org homeserver via .well-known instead of returning it from a call to the example.org homeserver?
  3. The IS-specs only talk about resolving a 3PID to a matrix-ID - never the other way round. So I see no way to query for the id-server for " @ user : evildomain " and find out about his email ...
  4. even for server to server communication I do not yet see a way this would be helpful - but as I said - so far I have no idea how a sensible Identity-server federation is going to work anyway ... if I want to resolve user @ hotmail.com - do I ask the matrix.org ID-Server or the hotmail.com ID-server (if there is one) or my own personal ID-server?

uhoreg commented 6 years ago

I'm not sure that "plays nicer with .well-knowns" or "is easier to register" is a reason for preferring a directory, as I think it's more common to just have a single file rather than a directory. I haven't looked through all the registered .well-knowns, but of the ones I've seen, most of them are single files, acme-challenge, dnt, and posh are directories, webfinger uses query parameters (let's not do that), and caldav and carddav are HTTP redirects. My preference is still to use a single file in order to encourage us to keep things (relatively) simple, but I do also understand the desire to make it extensible.

As far as Identity Servers are concerned, I just realized that there are two different reasons that you may be asking about identity servers for a given domain. The first one is: "I want to log in as @uhoreg:example.com. What are the identity servers that example.com trusts/what identity server should the client be configured to use" and the second one is "I want to log in using the email address uhoreg@example.com. What identity server should I use to resolve that?", and those questions may have different answers. The first one, as I said before, I think would be better served by adding a new Matrix Client-Server endpoint to query the identity servers. The second one I could see as something that might fit in .well-known/matrix.json, but I'm also wondering if that could also just be handled by WebFinger/WebFist. It may also open users up to malicious email providers (e.g. Google decides that all gmail.com email addresses should be resolved using their identity server, regardless of what the user actually wants), but I can also see it being useful for, say, corporate deployments.

maxidorius commented 6 years ago

@IT-Sumpfling

  1. the Identity server URL field needs to be given at login time. You would query the Matrix ID domain of the user trying to login to find the Identity server that should be used, just like you would do it for the homeserver.
  2. Because the Identity server is configured client-side, not homeserver-side, the homeserver doesn't know which IS the client will use or has to use. synapse has a list of allowed IS URLs, but that's only for synapse and is not part of the spec.
  3. You don't query the IS to get the email of the user. You only find the IS that the client should use as a user of that domain.
  4. That's actually solved in mxisd as it does federation and auto-discovery using domains/DNS for applicable 3PIDs (like emails)
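
Point 1 above (query the domain of the Matrix ID) can be sketched as follows. The function names are hypothetical, and the default path is just one of the candidates floated in this thread (`/.well-known/matrix`, `/.well-known/matrix.json`, `/.well-known/matrix/matrix.json`), none of which has been agreed on at this point.

```python
def domain_of_matrix_id(matrix_id: str) -> str:
    """Return the domain part of a Matrix ID of the form @localpart:domain."""
    if not matrix_id.startswith("@") or ":" not in matrix_id:
        raise ValueError(f"not a full Matrix ID: {matrix_id!r}")
    # Split on the first colon only: everything after it is the domain part.
    _localpart, domain = matrix_id[1:].split(":", 1)
    if not domain:
        raise ValueError(f"empty domain in {matrix_id!r}")
    return domain


def well_known_url(matrix_id: str, path: str = "/.well-known/matrix") -> str:
    """URL to fetch for discovery; `path` is an assumption, not a decision."""
    return f"https://{domain_of_matrix_id(matrix_id)}{path}"


print(well_known_url("@madduck:madduck.net"))  # prints https://madduck.net/.well-known/matrix
```

An email-style login such as `madduck@madduck.net` is rejected by this sketch on purpose: mapping a 3PID to a domain is the separate, contested question discussed in this thread.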

IT-Sumpfling commented 6 years ago

@uhoreg

> My preference is still to use a single file in order to encourage us to keep things (relatively) simple, but I do also understand the desire to make it extensible.

=> Ok, let's go for /.well-known/matrix which should be trivial to configure (server-side) to return a file OR a default-file from the dir /matrix/ - maybe we can have the cake and eat it 😉

regarding the identity-server "duality" - yes, that was exactly what I was trying to say. To add complexity: even the answer to the first question "what are the ID servers that example.org trusts" may NOT be what I configure at my client ... ok, a bit paranoid maybe, but just because I trust example.org (homeserver/login) does not mean that I trust everyone that example.org trusts (ID-server they "prefer"). On the other hand - being able to configure THE Id-server that the client uses (and NOT let the client override) probably is quite important for some companies (but I don't know if "we" (matrix) should aim for this ATM). Even more reason NOT to do this in .well-known but in a matrix-endpoint - complete with authentication etc. - this could either leave the client ID-Server-field "open" or override it (at least for "complying" clients)

And the second case (which would fit into the .well-known) - yes, it would open users to malicious providers (@ gmail.com always returning a matrix-id of @ user-local-part : gmail.com and therefore collecting logins ...). So I don't know if I really want this or if the specs should not be changed to force login via a full matrix-id (which could also be something like @ +49123456789 : example.org ) ... but we are getting WAY off topic ...

smokku commented 6 years ago

There is RFC 6415 defining a standard of what to put to /.well-known/ to describe Web Host Metadata.

XMPP builds on this to define Alternative Connection Methods.

uhoreg commented 6 years ago

Yes, I had considered it, but I wasn't sure if service discovery was appropriate use for host-meta.

maxidorius commented 6 years ago

Formal proposal for this.

uhoreg commented 6 years ago

The two remaining issues were whether we would support removing the /_matrix/ prefix, and regarding underscores in the homeserver versus identity_server names.

After further discussion, removing the /_matrix/ prefix seemed like more work for little gain (nobody so far has had serious complaints about it), and given that the consensus seemed to be that either option was acceptable, although people had preferences, we decided to just leave the /_matrix/ prefix in for now. Removing the /_matrix/ prefix can be done in the future as a more general MSC.

Regarding names, we decided that we would use homeserver and identity_server, as the spec uses "homeserver" as one word, and "identity server" as two words.

So it seems like this issue is now ready to be turned into a spec PR.

ara4n commented 6 years ago

In the interests of tracking folks' approval on this one, it'd be good to get a thumbs-up reaction on this comment from @erikjohnston @richvdh @dbkr @turt2live @anoadragon453 and anyone else who cares (a bit like we've done on https://github.com/matrix-org/matrix-doc/issues/1232#issuecomment-398781476 to track agreement before declaring the proposal having passed review). (@uhoreg doesn't count as he's shepherding :)

turt2live commented 6 years ago

There's a couple unanswered questions in the proposal still - I've added comments on the doc. They are probably answered in the comment thread here and haven't been transferred to the doc yet.

uhoreg commented 6 years ago

Regarding the prefix stuff, yes, the decision was "Option 0: Do nothing", to avoid scope creep and adding extra work for client authors.

Regarding IGNORE, good catch. IIRC, the spec doesn't specify any current client behaviour, so it seems like "before the auto-discovery process" would mean "do whatever you want", which may not be the most helpful thing. It seems to me like PROMPT would be a better replacement option than FAIL_PROMPT, where I understand the difference between the two to be: PROMPT="ask the user for the URL", and FAIL_PROMPT="tell the user that something unexpected happened, and ask the user for the URL". @maxidor do you have any comments on this?

maxidorius commented 6 years ago

To clarify: Whatever clients would do if there was no well-known auto-discovery is Step 1. Well-known auto-discovery is considered Step 0. IGNORE means carry on as if there is no .well-known auto-discovery, and directly go to Step 1.

This ensures:

njouanin commented 6 years ago

May be off-topic... It may happen that HS implementations provide specific API endpoints (for example an admin extension, ...). Does it make sense to say (this is a proposition) that these APIs should be placed in a specific endpoint tree like /_synapse/ or /_plasma/?

uhoreg commented 6 years ago

@njouanin It is a bit off-topic, and I don't know if there is any official suggestion for that, though my own personal feeling is that it makes sense to put it in a specific endpoint tree. As far as discovery of these endpoint trees goes, since the well-known is a JSON file, one could put in a specific key that can be checked. For example (purely for illustrative purposes), one could add in a "org.matrix.synapse": {"admin": "https://example.com/_synapse/admin"} or some such thing, and this would be defined by the server author.
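
A client-side lookup for such a server-defined key might look like the sketch below. It reuses the purely illustrative `"org.matrix.synapse"` / `"admin"` names from the comment above; the function itself is hypothetical and not part of any proposal.

```python
import json
from typing import Optional


def vendor_endpoint(body: str, namespace: str, name: str) -> Optional[str]:
    """Look up an implementation-specific endpoint under a namespaced key.

    Returns None when the key is absent or malformed, so clients that do
    not know the extension are unaffected.
    """
    data = json.loads(body)
    section = data.get(namespace) if isinstance(data, dict) else None
    if not isinstance(section, dict):
        return None
    value = section.get(name)
    return value if isinstance(value, str) else None


doc = '{"org.matrix.synapse": {"admin": "https://example.com/_synapse/admin"}}'
print(vendor_endpoint(doc, "org.matrix.synapse", "admin"))  # prints https://example.com/_synapse/admin
```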

uhoreg commented 6 years ago

Upon further discussion, and better understanding @maxidor's reasoning, it seems like we have a consensus that a 404 should be left as-is, to allow for implementation-specific behaviour (such as default homeservers) and/or future discovery mechanisms. The actual PR that comes out of this may include clarifying or modified language to make this clearer.
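
The outcome above can be sketched as a small decision function. Only the 404 case reflects the consensus described in this comment; the `USE` name and the treatment of every other response as FAIL_PROMPT are assumptions made for the sketch, and the spec PR may define them differently.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    IGNORE = "IGNORE"            # carry on as if there were no .well-known discovery
    FAIL_PROMPT = "FAIL_PROMPT"  # report the problem, then ask the user for the URL
    USE = "USE"                  # hypothetical name: use the discovered URL


def action_for_response(status: int, homeserver_url: Optional[str]) -> Action:
    """Map the result of fetching the well-known document to a client action."""
    if status == 404:
        # Consensus from the thread: a 404 is left to the client's own
        # behaviour (default homeserver, other discovery mechanisms, ...).
        return Action.IGNORE
    if status == 200 and homeserver_url:
        return Action.USE
    # Assumption: anything else is unexpected enough to tell the user.
    return Action.FAIL_PROMPT
```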

turt2live commented 6 years ago

I didn't mean to open a can of worms on this, however it's still not explicitly described what that process is (outside of comments on here).

The only reason I'm pushing for it to be in the google doc is so that the doc and the comments here are aligned. It's been incredibly frustrating in the past to track down where something was said to find out why it differs, particularly when trying to figure out the motivation for a particular design decision. Given the possible surface area for comments is 2+ rooms, 2 docs, this issue, and the upcoming PR, it'd be nice if the more static places (docs, issue, PR) are aligned.

uhoreg commented 6 years ago

Yes, I plan on updating the Google doc with a summary of the discussions, and will probably do so some time after I get back home (Tuesday).

uhoreg commented 6 years ago

Done via matrix-org/matrix-spec-proposals#1359