jakartaee / servlet

Jakarta Servlet
https://eclipse.org/ee4j/servlet
Other
261 stars 85 forks source link

New servlet request getRemoteScheme() for servlet 6. #402

Open pmd1nh opened 3 years ago

pmd1nh commented 3 years ago

The current servlet request getScheme() returns the name of the scheme used to make this request. As I understand, it is the scheme of the original client request (i.e the one before any proxies).

Consider a scenario:

Client ------(https)-------Proxy-------(http)--------AppServer

AppServer: request.getScheme() returns (https)

There is not a convenient way to find the scheme of the Proxy request to the AppServer (i.e (http) in this case)

There are several "Remote" APIs, for example getRemoteAddr(), getRemoteHost(), getRemotePort(), so should there also be a getRemoteScheme() to retrieve the last proxy's request scheme that sent to the server?

gregw commented 3 years ago

Whilst adding Local and Remote variants of getScheme would bring the API into line with Host/Addr/Port, I'm not so sure that is sufficient. Those methods have never been either fully defined nor sufficient for the complexities of proxying setups eg, multiple intermediaries.

Perhaps we should consider an API that fully supports https://tools.ietf.org/html/rfc7239 Forwarded header, so it can be be decoded easily? Something like Forwarded getForwardedHeader() and the Local/Remote APIs can then be deprecated in preference to the new API. Implementations supporting the old defacto standard X-Forwarded-For headers could still return a Forwarded object, which could even report which Header(s) it was generated from.

stuartwdouglas commented 3 years ago

I like the idea in theory, but I worry that people might start writing code that blindly trusts the getForwardedHeader() result. This is fine (mostly) if you are behind a proxy, but if that code then gets deployed to the open internet an attacker can effectively spoof their address by sending a forwarded header.

Also in terms of X-Forwarded-For vs Forwarded the system would need to be configured for one or the other, not both. If you do Forwarded with an X-Forwarded-For fallback an older proxy that only understands X-Forwarded-For might let an attacker send through a Forwarded header which would take precedence over the actual header.

I guess I actually don't like the idea, as I am worried that users won't understand the security implications here.

gregw commented 3 years ago

@stuartwdouglas I guess with current forwarded header handling, the container needs to be configured to even pay attention to any of those headers. So I agree having a raw getForwardedHeader method would be bad for security. However, IF it can be gated by only being available if the container has been so configured, then it would be good to standardise how that information is presented.

stuartwdouglas commented 3 years ago

I think that would be ok. One issue though is that at the moment if Undertow is configured to use these headers it will overwrite the getRemote* values, and return the actual values of the end client, not the proxy (and I assume other containers work the same way).

I don't want to change that as it would break a lot of apps, and if we just provide a parsed Forwarded header you still don't get any information about the final proxy itself (i.e. the one with an actual connection to the container), which is what this issue is actually asking for, so maybe whatever object we return also has the physical connection properties as well as the forwarding info.

gregw commented 3 years ago

@stuartwdouglas yeah I think the existing methods should continue to work the way they are (but perhaps with better documentation). Depending on the exact headers and configuration, Jetty can overwrite not only the getRemote values but also getHost, getServerName, getScheme etc. I don't want to change any of that unless there is a really good reason.

However, they provide an incomplete picture of reality, so instead of adding more and more methods trying to give more detail, I think it would be good to have a single new method, that if asked will give an object that will describe everything it knows about the path from client to server: all hops, all intermediaries, all addresses,ports,protocols etc.