python-hyper / draft-http2-debug-state

Source information for the HTTP/2 debug state internet draft.

Make it safe to reverse proxy #7

Open mnot opened 8 years ago

mnot commented 8 years ago

This came up at the Workshop -- if a reverse proxy is deployed in front of a server, and that reverse proxy doesn't know about this convention, it can expose state to the back end server.

Whether or not that's a security issue depends on the information exposed.

One way to avoid this is to use OPTIONS with Max-Forwards, e.g.,

OPTIONS /.well-known/h2-debug-state
Max-Forwards: 0

However, the downside is that this would make it difficult / impossible to use from a browser, which is probably the point. If it's just meant for programmatic access, maybe a new frame type makes more sense.
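For concreteness, here is a sketch of the probe described above; the helper name is hypothetical and this is just the HTTP/1.1 wire form (over h2 the same header would travel in a HEADERS frame). Max-Forwards: 0 asks a compliant intermediary to answer the OPTIONS itself rather than forward it to the origin.

```python
def build_probe(authority: str) -> bytes:
    """Serialize an OPTIONS probe of the first hop only (hypothetical helper).

    Max-Forwards: 0 tells a compliant intermediary to respond itself
    rather than forward the request upstream.
    """
    return (
        "OPTIONS /.well-known/h2-debug-state HTTP/1.1\r\n"
        f"Host: {authority}\r\n"
        "Max-Forwards: 0\r\n"
        "\r\n"
    ).encode("ascii")
```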

Lukasa commented 8 years ago

So this is kinda vaguely dupe-y with #2: @louiscryan probably cares about seeing this issue too.

I think the general consensus is that the only security risk so far being exposed is the HPACK compression state. As far as I see it, then, we have some options:

  1. Use OPTIONS. It feels weird, and you're right that if we did that we should really strongly consider using a frame instead.
  2. One thing we could do is say that certain fields (e.g. HPACK state) should only be dumped if X-Forwarded-For (or some similar header) is not present.
  3. We could say that the entire response should only be served if X-Forwarded-For is not present, given that if it is connection coalescing is probably happening and we should 404 instead.
  4. We could just abandon dumping HPACK state.

I'm not sure which of those I prefer. And then we need to ask whether intermediaries should also be inserting themselves into this if they want to, and how they should do that.

mnot commented 8 years ago

How important is being able to just show this in the browser?

If it is, I guess the other option would be to require HTTP auth for it, or something similar.

Lukasa commented 8 years ago

I'd say it's moderately annoying if we can't show it in a browser. If you're writing a server, the obvious way to want to test it is with a browser. Requiring a separate H2 frame removes that freedom, which is a bit sad.

While I'm here, I should point out that if we want to mint a new H2 frame it gets quite a bit more annoying not to pass this spec through the IETF process: if we don't, there's always a risk that someone might stamp on our new frame ID!

Auth could work, though it's not clear to me that it solves the intermediary issue, does it?

mnot commented 8 years ago

Use the experimental frame ids :)

re: auth and intermediaries - hm. Probably should get a Real Security Person to look at this.

Lukasa commented 8 years ago

That seems sensible to me. Got anyone in mind? My Real Security People aren't really H2 people.

mnot commented 8 years ago

@ekr?

ekr commented 8 years ago

@mnot, @Lukasa I'm still trying to wrap my head around the concern here. @Lukasa says that it's HPACK compression state, so I'm assuming the concern here is that we have connection sharing and I could use that to extract other people's data? Is that it?

If so, I have some other potential concerns: what if the proxy uses headers and the like to communicate its own state to the server? I might be able to get the server to act as an oracle for that.

Lukasa commented 8 years ago

I'm assuming the concern here is that we have connection sharing and I could use that to extract other people's data? Is that it?

@ekr Yeah, the concern is that HPACK state is only safe to expose hop-by-hop: there is no guarantee that anywhere beyond the first hop it's safe to expose that state.

If so, I have some other potential concerns: what if the proxy uses headers and the like to communicate its own state to the server?

Yup, this is another good reason to not want to expose the HPACK state on that second leg. I'm generally of the view that it's only safe to expose HPACK state one hop.

ekr commented 8 years ago

Given this, my suggestion would be to just not dump HPACK state (note: I haven't analyzed anything else here). We can always invent some new way to indicate that that's OK later.

bradfitz commented 8 years ago

I haven't seen anybody propose using this mechanism to let one connection debug another connection's state, and I don't think we want to open that can of worms, so I think OPTIONS is fine, at least to get HPACK state.

It's cute to demo this in the browser to show people what it does (and we can still do that with GET, as long as GET doesn't return HPACK state), but in reality you're going to use this from the TCP connection you're debugging, and unless you're @mcmanus you're probably not using your browser to debug your HTTP/2 client implementation, so it's not particularly onerous to make your HTTP/2 client send OPTIONS instead of GET.
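A minimal sketch of the GET/OPTIONS split suggested above, with an assumed "hpack" field name; nothing here is normative, it just shows the shape of the idea.

```python
# Hypothetical sketch: GET stays browser-friendly but omits HPACK state;
# OPTIONS (sent from the connection actually being debugged) may include it.
# "hpack" is an assumed field name in the debug document.
def debug_document(method: str, state: dict) -> dict:
    doc = {k: v for k, v in state.items() if k != "hpack"}
    if method == "OPTIONS" and "hpack" in state:
        doc["hpack"] = state["hpack"]
    return doc
```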

PiotrSikora commented 8 years ago

Please note that not every proxy respects the Max-Forwards request header (e.g. NGINX doesn't), so we need to handle this at the HTTP/2 level with a new frame (see #9) in order to guarantee that this stays hop-by-hop.

bradfitz commented 8 years ago

If GET and even OPTIONS can be proxied, and adding a new frame (#9) is too complicated, what about adding a new boolean setting which a peer must advertise in their initial settings frame to declare that they understand /.well-known/h2/state and won't proxy it? Only if that setting is true should peers ever query the state.

Lukasa commented 8 years ago

If GET and even OPTIONS can be proxied, and adding a new frame (#9) is too complicated, what about adding a new boolean setting which a peer must advertise in their initial settings frame to declare that they understand /.well-known/h2/state and won't proxy it? Only if that setting is true should peers ever query the state.

So my concern with this idea is that it only protects us against legitimate users, not malicious ones. If you choose to ignore the setting and have a naive proxy in the way, the proxying will still happen and the HPACK state will be exposed to the malicious user.

bradfitz commented 8 years ago

Consider:

   [ Malicious UA ]  -----> [ Naive Proxy ] -----> [ Backend ]

And let's say the Backend supports /.well-known/h2/state. First, the Backend will only serve that over h2, not h1. And the Backend should only serve it if it saw the initial SETTINGS frame bit from the Naive Proxy. The Naive Proxy, being naive, would not have sent the SETTING, so the Backend should return an error when the Naive Proxy forwards it that request from the Malicious UA.

Or what scenario were you considering?

Lukasa commented 8 years ago

Oh, I see, sorry. I misunderstood: I assumed that only clients were listening for it. Yes, the opt-in via client SETTINGS bit seems like a sensible middle-of-the-road compromise.

bradfitz commented 8 years ago

How do we get a number?

Lukasa commented 8 years ago

@mnot knows the rules, but I think we just pick something in the upper end of the range. random.choice(range(2**15, 2**16))?

bradfitz commented 8 years ago

I like to pick my arbitrary numbers based on the numbers immediately above a certain word.

25153 (0x6241) is above the keys "STATE" on a QWERTY keyboard.

Lukasa commented 8 years ago

@bradfitz Done, 0x6241 it is.

@mnot, @PiotrSikora, @louiscryan: How do you feel about @bradfitz' proposed approach here? Essentially, clients need to send SETTINGS_SEND_ME_DEBUG (TODO: better name). Proxies that don't support this function must strip that settings bit before forwarding, and servers must only respond to this URI if they've received the settings bit. That allows proxies to opt-in (and do whatever transforms they want), and allows servers to ensure that they're not behind a naive proxy that will bust all the things.

PiotrSikora commented 8 years ago

I'm confused...

[ client ] --- [ proxy A ] --- [ proxy B ] --- [ server ]

Which proxy and/or server is expected to generate response to a query from the client? Do we expect it to make it all the way to the server or only to the first proxy?

Lukasa commented 8 years ago

@PiotrSikora

Server receives a request for /.well-known/h2/state. If it received SETTINGS_SEND_ME_DEBUG=1, it generates a 200 and fills it out per this I-D. If it did not, it responds 404.

The reason this works is essentially that proxies A and B are only allowed to set this setting if they opt in to the proxy behaviour (that we have yet to write). That proxy behaviour will be, at minimum, stripping the HPACK state and replacing it with its own client-side HPACK state. It'll probably also be appending some information to tell you that there's a proxy there, though we may want to think about that.

Essentially, then, this means that the URI doesn't work unless every link in the chain supports it.

We could also define a fallback mode, whereby a proxy that supports this draft, when receiving a 404 for this URI, can choose to rewrite the response into a 200 with its own state, but I'm not sure how I feel about that yet.
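The server-side decision described above can be sketched as follows; the function and settings-map shape are hypothetical, and 0x6241 is the experimental identifier picked earlier in the thread.

```python
# Hypothetical experimental settings identifier (0x6241, picked above).
SETTINGS_SEND_ME_DEBUG = 0x6241

def respond_to_state_request(received_settings: dict, debug_state: dict):
    """Decide how to answer a request for /.well-known/h2/state.

    Serve the debug document only if the peer opted in via SETTINGS;
    otherwise behave as if the resource does not exist (404).
    """
    if received_settings.get(SETTINGS_SEND_ME_DEBUG) == 1:
        return 200, debug_state
    return 404, None
```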

PiotrSikora commented 8 years ago

Well, is the connection state from a server 3 hops away useful at all to the client?

Lukasa commented 8 years ago

Given that it can cause failures, I'd say yes.

PiotrSikora commented 8 years ago

But given that the response presumably won't contain states of other connections, it might result in false positives (i.e. flow control might be stuck because of issues between proxy A & proxy B).

PiotrSikora commented 8 years ago

Also, going back to my original question of "which proxy and/or server responds to this query":

  1. What if proxy B acts as a web server for /static/, but proxies everything else to the server? Which one should generate response?
  2. What if proxy B acts as a web server for everything but *.php, which it proxies to the server? Which one should generate response?
  3. What if proxy A terminates HTTP/2 and sends traffic over HTTP/1.1 to proxy B? Should it generate response with its own state or return error?
Lukasa commented 8 years ago

But given that the response presumably won't contain states of other connections, it might result in false positives (i.e. flow control might be stuck because of issues between proxy A & proxy B).

As I mentioned earlier, we may want to extend this so that proxies that set the SETTINGS value also should amend the JSON document to add their own data in as appropriate.

What if proxy B acts as a web server for /static/, but proxies everything else to the server? Which one should generate response?

Depends on the proxy rewriting rules. In this case, /.well-known/h2/state isn't in /static/, so it proxies through and then optionally amends the JSON document.

What if proxy B acts as a web server for everything but *.php, which it proxies to the server? Which one should generate response?

This one is somewhat trickier. I'd say it should pass it through, but I'd be pretty nervous about making that a MUST.

What if proxy A terminates HTTP/2 and sends traffic over HTTP/1.1 to proxy B? Should it generate response with its own state or return error?

This one is self-contained. SETTINGS don't travel over HTTP/1.1, so it either needs to respond with its own state or 404.
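One possible shape for the "amend the JSON document" behaviour mentioned above, with assumed field names ("hpack", "hops"): a participating proxy strips the upstream HPACK state, which is only safe hop-by-hop, and appends a record of its own.

```python
import json

def amend_debug_state(upstream_body: bytes, own_state: dict) -> bytes:
    """Hypothetical proxy-side rewrite of the debug document.

    Drops the upstream hop's HPACK state (never safe to forward) and
    appends this proxy's own state to an assumed "hops" list.
    """
    doc = json.loads(upstream_body)
    doc.pop("hpack", None)  # never forward another hop's HPACK state
    doc.setdefault("hops", []).append(own_state)
    return json.dumps(doc).encode("utf-8")
```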

PiotrSikora commented 8 years ago

This one is somewhat trickier. I'd say it should pass it through, but I'd be pretty nervous about making that a MUST.

To make it even trickier, what if *.cgi is passed to server2? Which backend server should this request be proxied to? The current proposal doesn't offer a mechanism to follow the path of a real request.

This one is self-contained. SETTINGS don't travel over HTTP/1.1, so it either needs to respond with its own state or 404.

Assuming that it can respond with its own state... How is the client supposed to know whether it received the response from proxy A or server?

Lukasa commented 8 years ago

To make it even trickier, what if *.cgi is passed to server2? Which backend server should this request be proxied to? The current proposal doesn't offer a mechanism to follow the path of a real request.

So that's true, but the only way we can avoid that is if we define a frame type that requires that intermediaries forward it as if it were a request to a URI in the frame. My initial instinct is that that's a pretty tough requirement to add.

Note as well that even if we did add that, reverse proxies can loadbalance between multiple backend machines. The only way to get the complete debugging information you're after is if we emit a frame that says "please route this frame exactly as you routed stream X", and there's just no way that intermediaries are going to volunteer to store the historic state for that frame to be appropriately handled.

How is the client supposed to know whether it received the response from proxy A or server?

It's not relevant. The client needs as much information as it can get about the HTTP/2 connection, but if at any point it stops being an HTTP/2 connection the question is simply ill-formed. Ideally the client wants information about the connection on each hop, but once it's out of hops it has to be done.

If we attempt to draft this specification to be a perfect oracle of connection state the whole way through a mixed H1/H2 multihop request/response cycle we will never succeed in deploying anything. We should make a reasonable effort to give the client as much information as possible, but I think we need to accept reasonable limits on how much detail we can provide.

PiotrSikora commented 8 years ago

So that's true, but the only way we can avoid that is if we define a frame type that requires that intermediaries forward it as if it were a request to a URI in the frame. My initial instinct is that that's a pretty tough requirement to add.

Note as well that even if we did add that, reverse proxies can loadbalance between multiple backend machines. The only way to get the complete debugging information you're after is if we emit a frame that says "please route this frame exactly as you routed stream X", and there's just no way that intermediaries are going to volunteer to store the historic state for that frame to be appropriately handled.

I believe that putting connection in a "debug mode" using SETTINGS with HTTP/2 frames that include connection state being part of the stream would address this issue (see #9).

It's not relevant. The client needs as much information as it can get about the HTTP/2 connection, but if at any point it stops being an HTTP/2 connection the question is simply ill-formed. Ideally the client wants information about the connection on each hop, but once it's out of hops it has to be done.

I disagree... You just said that the connection state 3 hops away can cause failures, and I don't see why the fact that one hop is done over HTTP/1.1 should change that, i.e.

[ client ] -- H/2 -- [ proxy A ] -- H/1.1 -- [ proxy B ] -- H/2 -- [ server ]

If we attempt to draft this specification to be a perfect oracle of connection state the whole way through a mixed H1/H2 multihop request/response cycle we will never succeed in deploying anything. We should make a reasonable effort to give the client as much information as possible, but I think we need to accept reasonable limits on how much detail we can provide.

Agreed... However, I'm questioning the usefulness of making this a multi-hop feature (at least in current design).

Lukasa commented 8 years ago

I believe that putting connection in a "debug mode" using SETTINGS with HTTP/2 frames that include connection state being part of the stream would address this issue (see #9).

Can you elaborate on how?

I disagree... You just said that the connection state 3 hops away can cause failures, and I don't see why the fact that one hop is done over HTTP/1.1 should change that, i.e.

It doesn't, but the question stops making sense. If we can't propagate the "I understand the security implications of this information" statement through the connection, then we have to assume that the next endpoint does not understand the security implications of the information. Put another way, if you can only make a hop over HTTP/1.1, it is safe to assume that the entity you're contacting that cannot speak H2 does not understand the security implications of delivering H2 connection state.

Agreed... However, I'm questioning the usefulness of making this a multi-hop feature (at least in current design).

It may be that it's not useful. That's certainly a valid question.