spec: consider well-known URIs for discovery

jonboulle commented 9 years ago

At FOSDEM someone suggested that a better-specified, standardised way to serve discovery metadata would be to use well-known URIs (RFC 5785) instead of HTML meta tags.

The basic idea is that there is a centralised (IETF) registry of well-known URI names, which are then exposed at $URLROOT/.well-known/$NAME. In our case it would probably be something like .well-known/ac-discovery, and this endpoint could serve a simple JSON file containing the URL templates. Presumably we would initially just do this "unofficially", and at some point we could try to register the name through the process described in section 5.1 of the RFC.
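To make the idea concrete, here is a minimal Python sketch of what a client might do. The name `ac-discovery`, the JSON layout (`endpoints`, `prefix`, `template`), and the storage URL are all assumptions made up for illustration; none of them is registered or part of the spec. Note that RFC 5785 spells the path segment `.well-known`, with a hyphen.

```python
import json
from urllib.parse import urlsplit, urlunsplit

# Hypothetical, unregistered well-known name used only for illustration.
WELL_KNOWN_NAME = "ac-discovery"


def well_known_url(root: str, name: str = WELL_KNOWN_NAME) -> str:
    """Return the RFC 5785 well-known URL for the host in `root`.

    Well-known URIs live at the root of the authority, so any path,
    query, or fragment in `root` is dropped.
    """
    parts = urlsplit(root)
    return urlunsplit((parts.scheme, parts.netloc, f"/.well-known/{name}", "", ""))


def parse_discovery(body: str) -> list:
    """Parse a hypothetical JSON discovery document into (prefix, template) pairs."""
    doc = json.loads(body)
    return [(entry["prefix"], entry["template"]) for entry in doc["endpoints"]]


# Example document such an endpoint might serve (the format is an assumption).
SAMPLE = """
{"endpoints": [
  {"prefix": "example.com",
   "template": "https://storage.example.com/{name}-{version}.{ext}"}
]}
"""

print(well_known_url("https://example.com/some/page"))
print(parse_discovery(SAMPLE))
```

The client never has to fetch or parse HTML in this variant; it only needs the host name and one fixed path.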

Arguably this is less intrusive than meta tags, as it does not require the HTML of the page(s) (whether root or leaves) to be modified.

I am not convinced it is necessarily appropriate ("well-known URIs are not intended for general information retrieval or establishment of large URI namespaces on the Web"), or a marked improvement over meta tags, but throwing this up for discussion.

jheiss commented 9 years ago

I think in any moderately large organization any form of discovery using the organization's bare domain is unlikely to fly for various reasons both political and technical.

The Marketing group that runs the example.com webserver probably won't understand or be capable of hosting this sort of metadata. Your Security group probably won't be willing to take the risk that this data leaks to the public, mingled as it is with the company's public website.

So I think most folks are either going to end up with a hostname dedicated to container metadata (e.g. ac.example.com/reduce-worker), or they're going to go the DNS record route being discussed in #3.

Presuming folks use a form of HTTP discovery, then the question in my mind is what is the easiest to build automation around? People are going to want to run something like the Docker registry where users can push images and immediately deploy them.

The meta tag discovery mechanism requires statically or dynamically generating HTML pages with the requisite tags. That's technically simple, but doesn't feel very clean.

This .well-known URI scheme doesn't feel very nice either.

Ignoring the "prefix-match" portion of the current meta tags (which seems unnecessary to me), how about using HTTP Link headers (http://www.w3.org/wiki/LinkHeader)? Then you don't have to return an HTTP body at all.
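As a rough illustration of the Link header idea, a client could read the discovery targets straight from the response headers. The sketch below is a deliberately simplified parser in Python: it does not handle commas or semicolons inside quoted parameter values, and the `rel` names and URLs are hypothetical, not anything defined by the spec.

```python
import re

# Matches one link-value of the form: <target>; param=value; param="value"
# Simplification: parameter values must not contain ',' or ';'.
_LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')


def parse_link_header(value: str) -> list:
    """Split a Link header value into (target, params) tuples."""
    links = []
    for match in _LINK_RE.finditer(value):
        target, raw_params = match.group(1), match.group(2)
        params = {}
        for param in raw_params.split(";"):
            if "=" not in param:
                continue
            key, _, val = param.partition("=")
            params[key.strip().lower()] = val.strip().strip('"')
        links.append((target, params))
    return links


def targets_for_rel(value: str, rel: str) -> list:
    """Return the targets of all links whose rel parameter equals `rel`."""
    return [t for t, p in parse_link_header(value) if p.get("rel") == rel]


# Hypothetical header a registry might send; the rel names are assumptions.
HEADER = (
    '<https://ac.example.com/images/>; rel="ac-discovery", '
    '<https://ac.example.com/pubkeys.gpg>; rel="ac-discovery-pubkeys"'
)

print(targets_for_rel(HEADER, "ac-discovery"))
```

With this approach a HEAD request would be enough, since nothing in the body is needed.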

philips commented 9 years ago

@jheiss I would say that an HTTP header raises the bar quite a bit higher than serving up HTML. For example most cloud object stores make you jump through a lot of hoops to add a custom header.

The prefix-match portion is necessary so that you can host different projects at different hosting providers. For example example.com/open-source-project might be hosted as a simple github.com container but example.com/enterprise-edition might be a super sophisticated geo-replicated thing with auth.
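A small Python sketch of why the prefix matters: given several (prefix, template) entries, a client can route each image name to a different host. The "longest prefix wins, matched on whole path components" rule and every URL below are assumptions for illustration, not necessarily what the spec prescribes.

```python
from typing import Optional


def select_template(image_name: str, entries: list) -> Optional[str]:
    """Return the template whose prefix is the longest match for `image_name`."""
    best = None
    for prefix, template in entries:
        # Match whole path components so that "example.com/open" does not
        # match "example.com/open-source-project".
        if image_name == prefix or image_name.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, template)
    return best[1] if best else None


# Hypothetical entries: two projects under one domain, hosted in different places.
ENTRIES = [
    ("example.com/open-source-project", "https://oss-mirror.example.net/{name}-{version}.{ext}"),
    ("example.com/enterprise-edition", "https://registry.example.com/ee/{name}-{version}.{ext}"),
    ("example.com", "https://fallback.example.com/{name}-{version}.{ext}"),
]

print(select_template("example.com/open-source-project/worker", ENTRIES))
print(select_template("example.com/enterprise-edition", ENTRIES))
```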

Also, returning a full body isn't necessary: you can return a body containing only the meta tags if you have a "smart" registry that checks the ?ac-discovery=1 GET parameter.
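For illustration, this is roughly how little a client needs from such a minimal body. The Python sketch uses only the standard library; the assumed `content` layout ("prefix-match, whitespace, URL template") and the registry URL are assumptions for the example.

```python
from html.parser import HTMLParser


class DiscoveryMetaParser(HTMLParser):
    """Collect (prefix, template) pairs from <meta name="ac-discovery"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.entries = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if attributes.get("name") != "ac-discovery":
            return
        # Assumed content layout: "<prefix-match> <url-template>".
        fields = (attributes.get("content") or "").split(None, 1)
        if len(fields) == 2:
            self.entries.append((fields[0], fields[1]))


# A body with nothing but the meta tag, as a "smart" registry might return it.
MINIMAL_BODY = (
    '<meta name="ac-discovery" content="example.com/enterprise-edition '
    'https://registry.example.com/{name}-{version}.{ext}">'
)

parser = DiscoveryMetaParser()
parser.feed(MINIMAL_BODY)
print(parser.entries)
```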

I agree that a lot of orgs won't want to use their official corporate domain, but for many of the public projects you can imagine being in containers (databases, webservers, programming languages, etc.) I don't think it is a huge stretch of the imagination.

philips commented 9 years ago

On the main point I have never really seen this well-known scheme used in practice: http://www.iana.org/assignments/well-known-uris/well-known-uris.xml

The notable exception being keybase: https://keybase.io/docs/keybase_well_known

thkoch2001 commented 9 years ago

I just started to read the spec and my first impulse was to file the same issue. Please don't make up your own stuff for things that already have RFCs!

oleastre commented 8 years ago

Like others, I just came to this issue after trying to understand the discovery process. While using HTML meta tags for discovery has some advantages, it requires a dedicated HTML page for each image (with a dedicated URL that can conflict with your website content).

So, having site wide meta information will probably be more scalable and easier to put in place.

From that perspective, .well-known URLs and DNS entries are existing mechanisms for providing website-wide meta information. For reference, CalDAV and CardDAV define a service discovery mechanism based on DNS entries with a fallback to a .well-known URI; see RFC 6764. With some adaptations, this mechanism could be used for appc.

Some other uses of .well-known URIs include:

Let's Encrypt with the ACME protocol
OpenID Connect; see also Google's implementation

Just my 2 cents on this issue.

philips commented 8 years ago

We discussed well-known to a large degree in the past. I am not opposed to adding it to the spec if someone wants to do the work.

The two reasons we didn't end up using the well-known in the past:

1) We were inspired by the Go spec on this which is very similar with meta tags

2) The number of well-known users at the time we started appc was small and didn't seem to be growing. Notice that only 1 was added in 2014 and now nearly half were added in the last 3 months of 2015! https://www.iana.org/assignments/well-known-uris/well-known-uris.xml

brianredbeard commented 8 years ago

Following up on this: I think combining a DNS SRV record, which specifies the location of the server, with a GET /.well-known/ac-discovery call against that server should be sufficient. The combination follows the relevant RFCs, is extensible enough to be used by a large number of operators, and would support failover nicely. I would foresee a record of the form:

_acdiscovery._tcp.example.com.   300  IN SRV 10 10  8080 int-ac-discovery.example.com.
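A client-side sketch in Python of how such a record could be turned into discovery URLs. It parses the textual record rather than doing a real DNS lookup, orders targets by SRV priority (lowest first, per RFC 2782) and skips weight-based selection. The second "backup" record and the plain `http` scheme are assumptions added for the example, since an SRV record carries no scheme.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_srv(line: str) -> SrvRecord:
    """Parse one zone-file style SRV line: name ttl class type prio weight port target."""
    _name, _ttl, rr_class, rr_type, priority, weight, port, target = line.split()
    if rr_class != "IN" or rr_type != "SRV":
        raise ValueError(f"not an IN SRV record: {line!r}")
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


def discovery_urls(records: list, scheme: str = "http") -> list:
    """Build well-known URLs, lowest SRV priority first, so later entries act as failover."""
    ordered = sorted(records, key=lambda r: r.priority)
    return [f"{scheme}://{r.target}:{r.port}/.well-known/ac-discovery" for r in ordered]


RECORDS = [
    # Hypothetical backup server, listed first to show that ordering is by priority.
    parse_srv("_acdiscovery._tcp.example.com. 300 IN SRV 20 10 8080 backup-ac-discovery.example.com."),
    # The record from the comment above.
    parse_srv("_acdiscovery._tcp.example.com.   300  IN SRV 10 10  8080 int-ac-discovery.example.com."),
]

for url in discovery_urls(RECORDS):
    print(url)
```

A client would try the URLs in order and move on to the next one if a server is unreachable.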

spec: consider well-known URIs for discovery #160