.well-known is a poor fit for key configuration

martinthomson commented 2 years ago

The document talks about discovering targets, or oblivious target resources. It also specifies a location for a key configuration.

The key configuration is a property of the oblivious request resource. As the configuration provides information about oblivious target resources, this is not a good fit.

Also, because a single oblivious request resource can serve a large number of target resources, I would instead suggest a link relation. That is a more appropriate method of providing any sort of configuration for a single resource, and it is also better suited to this application.

.well-known provides information about entire servers. Origin-wide configuration, where there might be multiple oblivious request resources on the same server, is inflexible.

If the goal is to facilitate discovery without specifying paths, then putting the path of the oblivious request resource into the SVCB (HTTPS really) configuration might work. Or, if you don't like your configuration too explicit, a well-known location for an oblivious request resource that is able to forward oblivious requests to any oblivious target resource on the same origin might also work.

tfpauly commented 2 years ago

As terminology has improved, the document now is clear that this is about identifying an oblivious gateway resource and its configuration.

I don't think a link relation, directly, would make sense for this discovery mode (since you'd need to get the relation from some original request with a link header), but there may be a good analogy we can create in the SVCB parameters -- adding more info to let the gateway be separate from the target, and being explicit about that.

tfpauly commented 2 years ago

@martinthomson I'd be curious to hear your thoughts on the direction we could take with #21.

martinthomson commented 2 years ago

I'm not sure about #21. Maybe I'm reading it wrong, but it seems like it is jumping to the end without building the necessary foundation pieces.

First, let's set aside the key configuration piece for the moment. It's important, but I'll get back to that.

Without specifying a specific destination, a client that wants to make an oblivious request needs to construct a route that comprises a relay, gateway, and target. Each node needs to be willing to talk to the other (client<->relay, relay<->gateway, gateway<->target) for various reasons.
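A minimal sketch of that constraint, purely for illustration: a route is four nodes and three hops, and it is usable only if every hop is between parties willing to talk to each other. The `Route` type, the `willing` set, and all host names are hypothetical and not taken from any draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    client: str
    relay: str
    gateway: str
    target: str

    def hops(self) -> list[tuple[str, str]]:
        # The three pairings named above: client<->relay, relay<->gateway, gateway<->target.
        return [
            (self.client, self.relay),
            (self.relay, self.gateway),
            (self.gateway, self.target),
        ]


def unusable_hops(route: Route, willing: set[frozenset]) -> list[tuple[str, str]]:
    """Return the hops whose two endpoints have not agreed to talk to each other.

    `willing` holds unordered pairs (frozensets) of node names; this is an assumed
    representation of "willing to talk", not something defined in the thread.
    """
    return [hop for hop in route.hops() if frozenset(hop) not in willing]
```

An empty result means the route can be used; anything else names the hop that has to be renegotiated or replaced.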

You can approach this problem several ways. The first, obvious, and wrong option being that the client talks to its relay and asks where it can go. A relay that is configured for a small subset of destinations might be able to give an answer that is narrow enough to be useful. For instance, if we had a specific configuration that was just for DoH with a specific provider, then the relay would be able to help out.

I expect that relays will be somewhat more flexible than that, which tends not to be very useful if you are looking to learn where you can go. So it is likely that asking the relay is a poor choice.

Taking it from the opposite end, the client might seek to turn a direct request to a target URI into an oblivious request. Maybe it even has a set of target resources that it wants to make requests of. Again, to use DoH as an example, a DoH GET request is a set of target resources that a client might want to make oblivious.

In that model, you might consider asking the target URI about gateways that it is willing to talk to. In that case, the target is likely to have a far narrower view of what it is willing to use. Indeed, the best deployment of oblivious HTTP has the gateway resource running in the same server as the target resource. That way, "forwarding" the decrypted request doesn't even need to involve memcpy(), let alone a new HTTP request.

To that end, it makes sense to create an origin-wide configuration that says "if you want to make an oblivious request to any resource on this server, this is the oblivious gateway resource to use". I think that is functionally what you have here, almost. However, you are framing this as "find a gateway", which I think is wrong. If you are a client looking to establish a route, you don't start in the middle. The gateway is not an end, it is simply a means.

So I think that what you have is a .well-known resource that makes a claim about all resources on a server. This resource identifies one (or more!) resources that can be used as oblivious gateway resources. All resources on that server can be accessed as a target resource using oblivious HTTP via any one of the identified oblivious gateway resources. (As you note, you probably don't want the gateway resource to be a target, so any gateway resource might be implicitly excluded...if it happens to be on the same server.)

For SVCB, when you present a dohpath, you are talking about a single resource, so you don't need to talk about multiple targets. Well, except that DoH with GET really is multiple resources. So the same applies: you have a set of target URIs for which you want to advertise a gateway. Attaching an oblivious-gateway parameter to SVCB makes perfect sense.

So - functionally - what you have is right, but without talking about the target resource, you have a fairly confusing document.

As for key configuration, this is not a server-wide property. It is specific to the gateway resource. The gateway resource might not even be on the same server (though this isn't a problem as you created a separate resource...). @bemasc suggested elsewhere that we use what I'll call "method overloading" to acquire key configuration from the gateway resource. I think that as much as overloading generates a knee-jerk negative response (too much experience with latent bugs when writing C++), that makes a lot of sense here. You need the gateway resource to speak for itself. Sadly, OPTIONS doesn't work very well as it's not natively cacheable.

You write that .well-known is used to avoid targeting clients with specific key configurations. You even acknowledge that your defense doesn't work. If the server is able to choose a per-client identity for the gateway resource, then it doesn't need a per-client key configuration. The identity of the gateway resource will identify the client adequately. Relying on the DNS configuration being hard to manage is not a good defense for that. My only conclusion there is that we'll need a consistency scheme for the identity of that resource too.

Putting this all together, a client starts out with:

- a target URI
- a trustworthy relay that is willing to talk to virtually any gateway, configured with a URI template
- a desire to use oblivious HTTP for that target URI

The client starts by turning the target URI into a .well-known resource. It queries that resource (optionally, via the relay[^1]) and determines that the server has a gateway[^2].

The client then queries the gateway (again, probably via the relay) and requests key configuration[^2].

The client makes an oblivious request to the target via its configured relay and the newly discovered gateway.
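The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the well-known suffix `oblivious-gateway` is a made-up placeholder, the well-known resource is assumed to return the gateway URI as plain text, and `fetch` is a caller-supplied function (for example, a GET forwarded through the relay). None of these are defined by the thread.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical well-known suffix; the discussion does not name one.
WELL_KNOWN_SUFFIX = "oblivious-gateway"


def well_known_url(target_uri: str, suffix: str = WELL_KNOWN_SUFFIX) -> str:
    """Turn a target URI into the .well-known resource on the same origin."""
    parts = urlsplit(target_uri)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError("expected an absolute https target URI")
    # Path, query, and fragment are dropped because the claim is origin-wide.
    return urlunsplit((parts.scheme, parts.netloc, f"/.well-known/{suffix}", "", ""))


def plan_route(target_uri: str, relay_uri: str, fetch) -> dict:
    """Discover the gateway, ask it for its key configuration, and assemble the route.

    `fetch(url)` must return the response body as bytes. The body of the
    well-known resource is assumed to be the gateway URI in plain text.
    """
    gateway_uri = fetch(well_known_url(target_uri)).decode().strip()
    key_config = fetch(gateway_uri)  # the gateway "speaks for itself" via GET
    return {
        "relay": relay_uri,
        "gateway": gateway_uri,
        "target": target_uri,
        "key_config": key_config,
    }
```

Consistency checking is deliberately left out, in line with the second footnote.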

Alternatively, the SVCB case starts with the client having:

- a resource name
- a service type
- a trustworthy relay
- a burning passion for making an oblivious request

The client does the SVCB thing and gets a record that includes oblivious-gateway, plus the identity of the target resource. It can skip the bit where it fetches the .well-known resource, though it might need a consistency check that involves that fetch. It can then go straight to talking to the gateway and then the oblivious request.
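As a rough sketch of the SVCB path, assume the record's parameters are available as a presentation-format string and that `oblivious-gateway` carries a full gateway URI. The parser below is a simplification (no escaping or quoting rules from the SVCB specification), and the host names and parameter values are invented for the example.

```python
def parse_svc_params(text: str) -> dict[str, str]:
    """Split 'key=value key=value' SvcParams into a dict (simplified, no escapes)."""
    params = {}
    for token in text.split():
        key, _, value = token.partition("=")
        params[key] = value.strip('"')
    return params


def gateway_and_target(resolver_name: str, svc_params: str) -> tuple[str, str]:
    """Return (gateway URI, target URI) learned from one SVCB record."""
    params = parse_svc_params(svc_params)
    if "oblivious-gateway" not in params:
        raise LookupError("record does not advertise an oblivious gateway")
    if "dohpath" not in params:
        raise LookupError("record does not identify the target resource")
    # dohpath is a URI template; for this sketch keep only the literal path
    # that precedes the first {...} expression.
    path = params["dohpath"].split("{", 1)[0]
    return params["oblivious-gateway"], f"https://{resolver_name}{path}"
```

With both URIs in hand, the client can skip the .well-known fetch and go straight to asking the gateway for its key configuration.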

Does that all make sense?

[^1]: This only requires that the relay forward GET requests as well, which doesn't seem like a huge imposition here. Ben's draft suggests some pretty aggressive caching for this, which is a nice optimization - for both performance and privacy.

[^2]: Let's leave the consistency thing as an unsolved problem for the moment. All we need to do is describe how the information is established, not whether it is canon (correctness, not consistency...yet).

tfpauly commented 2 years ago

Thanks, @martinthomson. That does make sense.

I've pushed one commit to the branch to try to further clarify the target vs gateway. Some of this text got muddy in the transition of terminology.

The point should be: the client is interested in a target resource, and in the process of fetching information (via DNS, etc) about the host that serves the target resource, it learns that (a) target resources for this service/host are accessible using Oblivious HTTP as Oblivious Target Resources, and (b) there exists an associated Oblivious Gateway Resource that ought to be used to access the target resource.

I certainly think there can be further work done to improve the description and explanation, but based on your discussion above, I think we're agreeing on the intent of the mechanism here.

Regarding the two usage scenarios you mention at the end, we're trying to address the latter case. The former may be interesting, but I haven't thought through any use cases for that. You're correct that in the latter case, the client doesn't need to be the one doing the well-known URI lookups — it can be the trusted relay.

I think the mechanism to discover the key configuration and the target resource path is still an interesting question. Here are the options that I see:

1. Identify the gateway via a name, and then use two separate well-known URIs on a gateway's name to separately serve the key configuration and the gateway requests. This is what the document describes.
2. Identify the gateway via a full URI, and try to overload a way to grab the key configuration for that URI. As you mention above, I think this is a dangerous way to go.
3. Identify the gateway and its config in two separate URIs in the discovery mechanism (SVCB, etc). This makes them separate and avoids well-known, but it doesn't necessarily seem better to me than just containing the entire config in the SVCB parameter instead, which was rejected in earlier discussion because it allows various targeting or impersonation attacks. The problem here is that there isn't enough to tie the config and resource together.

Hence, I didn't see a significantly better option than (1).

Thoughts?

martinthomson commented 2 years ago

So I don't think that (1) is the best option. Maybe I wasn't clear, but I think that (2) is better. That way, the gateway resource speaks for itself.

.well-known can and should be avoided here. When naming .well-known, it's a shame that we missed the opportunity to name it .block-chain. It's the first tool people reach for, when really it should be the last.

tfpauly commented 2 years ago

@martinthomson For (2), what is the request that we should send to the gateway resource to ask it what its key configuration is? Did you have a specific idea in mind?

martinthomson commented 2 years ago

GET. See https://bemasc.github.io/access-services/draft-schwartz-ohai-consistency-doublecheck.html#section-4.2-3 (which describes the request going to the gateway via the relay).

tfpauly commented 2 years ago

Got it, I can adjust to that.

bemasc commented 2 years ago

> @bemasc suggested elsewhere that we use what I'll call "method overloading" to acquire key configuration from the gateway resource.

FWIW, that's not quite how I think about it. What I was suggesting was really "indirection", not "overloading":

1. The Gateway is identified by a URL (URL #1) holding a config file.
2. You fetch (GET) this config file, which contains a URL (URL #2) of the actual Gateway Resource.
3. You use (POST) URL #2 as the Gateway Resource.
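A small sketch of that indirection, assuming the config file is JSON with a `gateway_url` field. Both the JSON encoding and the field name are invented for the example, and `http_get` / `http_post` are caller-supplied stand-ins for a real HTTP client.

```python
import json
from typing import Callable


def resolve_gateway(config_url: str, http_get: Callable[[str], str]) -> str:
    """GET URL #1 and read URL #2 (the actual Gateway Resource) out of the config file."""
    config = json.loads(http_get(config_url))
    return config["gateway_url"]  # hypothetical field name


def send_oblivious(
    config_url: str,
    encapsulated_request: bytes,
    http_get: Callable[[str], str],
    http_post: Callable[[str, bytes], bytes],
) -> bytes:
    """POST the encapsulated request to URL #2, discovered via URL #1."""
    gateway_url = resolve_gateway(config_url, http_get)
    return http_post(gateway_url, encapsulated_request)
```

If the config file happens to point back at its own URL, the same code ends up doing GET and POST against one resource, which is the "method overloading" case.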

It's true that you could make these two URLs the same, and use "method overloading" to disambiguate the requests. I don't really know why you would want to do that, but also I guess it's fine.

I haven't attempted to fully grasp #21 yet, but I do think .well-known might actually be a good fit for some variants of the "Oblivious DoH transparent upgrade" that seems to be a goal of this draft. In my view, the key observation is that, in Oblivious HTTP, we assume (perhaps wrongly...) that the "bootstrap seed" is consistent (i.e. not targeted), and require that any subsequent operations maintain that consistency. If the "bootstrap seed" is a domain name (e.g. DDR by name), then .well-known may be a natural way to discover a corresponding "ODoHv2" Target and Gateway. I think dohpath and OHTTP probably don't work together, due to consistency concerns.

tfpauly commented 2 years ago

Okay, @martinthomson, I've adjusted #21 to include the full gateway URI in the SVCB record, and use a GET to the URI to get the key config.

tfpauly commented 2 years ago

@bemasc I don't think dohpath is a problem here — dohpath is the path of the target resource. It's still useful when making the end-to-end encapsulated request. The gateway URI to which you send your encrypted requests is unrelated to that.

bemasc commented 2 years ago

The problem is that dohpath breaks the consistency assumptions of Oblivious HTTP.

Let's say you start with a resolver known by its name, resolver.example. You assume (without proof!) that lots of people are using resolver.example. You also know that resolver.example supports "ODoHv2". How do you connect? First you do a SVCB query for _dns.resolver.example, which gives you the dohpath ... but this query is not consistency-protected, so resolver.example could return dohpath=/dns-query-123456789{?dns} (with TTL=0 to prevent DNS caching). Now you issue "oblivious" queries to https://resolver.example/dns-query-123456789, but your queries are all linkable because you are the only client hitting that URL. This breaks OHTTP's privacy claims.
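To make the linkability concrete, here is an illustrative sketch of the server's view: group clients by the target URL that their dohpath expands to, and flag anyone who is alone in their group. The client labels and the mapping are hypothetical, and no single client could compute this itself, which is exactly why a consistency mechanism is needed.

```python
from collections import defaultdict


def target_url(host: str, dohpath: str) -> str:
    """Expand a dohpath template to its literal path (everything before the first '{')."""
    return f"https://{host}{dohpath.split('{', 1)[0]}"


def anonymity_sets(host: str, dohpath_by_client: dict[str, str]) -> dict[str, set[str]]:
    """Map each target URL to the set of clients that were told to use it."""
    sets: dict[str, set[str]] = defaultdict(set)
    for client, dohpath in dohpath_by_client.items():
        sets[target_url(host, dohpath)].add(client)
    return dict(sets)


def linkable_clients(host: str, dohpath_by_client: dict[str, str]) -> list[str]:
    """Clients whose target URL is unique to them, so all their queries are linkable."""
    return sorted(
        client
        for clients in anonymity_sets(host, dohpath_by_client).values()
        if len(clients) == 1
        for client in clients
    )
```

A client handed `/dns-query-123456789{?dns}` while everyone else gets `/dns-query{?dns}` ends up in an anonymity set of one.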

This is why I think we need to encapsulate the Target and Gateway URIs together, and apply consistency checks to them as a unit.

tfpauly commented 2 years ago

Ah, this is about the consistency checks. I think that's interesting to discuss, but I don't think it directly impacts this mechanism. We need to have a broader discussion about the approach to consistency checks. Checking for consistency as a unit is separate from encapsulating them together in the discovery mechanism.

ietf-wg-ohai / draft-ohai-svcb-config

.well-known is a poor fit for key configuration #17