epimorphics / elda

Epimorphics implementation of the Linked Data API
Other
53 stars 27 forks source link

Issue #215 forward header support #216

Closed simonoakesepimorphics closed 6 years ago

simonoakesepimorphics commented 6 years ago

Thanks for your comments Stuart. I have changed the proto and host overrides to be independent of each other, and made sure that the host of the base URI is retained even if it doesn't have a scheme (added a unit test for that scenario).

The RouterRestlet class already publicly depends on HTTPServletRequest and UriInfo doesn't provide header information, so I think it is appropriate to use it a parameter type.

requestURI can be obtained from the UriInfo argument so I have simplified the method signature by removing requestURI.

skwlilac commented 6 years ago

@simonoakesepimorphics Thanks for the updates, and for spotting the potential problem of handling port numbers in the X-Forwarded-Host header. It looks good on a read through. Would be good to test in a live situation with externally sourced headers. I'm wondering whether that can be addressed with some additional test cases in the elda-testing-webapp sub-project. Then it would probably be good to see if there was a more live scenario in which it could be tried out. @bwmcbride may be able to help set that up.

bwmcbride commented 6 years ago

I did a test deploy of Elda 1.5.1-SNAPSHOT to the LR dev presentation server. The page chrome URLs were correctly https urls when accessed via http and http urls when accessed over https. On the currently deployed version they are http urls even when accessed over https.

bwmcbride commented 6 years ago

@simonoakesepimorphics skwlilac Couple of further thoughts:

1) I didn't do anything in the Elda config to enable this behaviour. Is that OK? You don't want to break existing deployments. The judgement may be that it is safe to have the behaviour change as no user's will experience any untoward side effects. But if you want to be sure, then there should be some means of activating this feature and it should be off by default.

2) The version I tested was 1.5.1-SNAPSHOT. Wondering what the release version number will be. As its a new feature, assuming its not a breaking change, I'd expect a minor version number bump, rather than a patch number bump as suggested by the trailing 1 in the SNAPSHOT version number.

simonoakesepimorphics commented 6 years ago

@bwmcbride @skwlilac @ehedgehog

  1. I would tend to err on the side of caution by using a config option to toggle the feature, unless it seems too trivial and that it would be a nuisance to have to turn it on.
  2. Chris and I discussed the possibility of including this feature in the 1.5.0 release if we are satisfied that it's in a good state. Otherwise I am not sure what would be the best way to follow version number conventions so I will follow your lead on that.
bwmcbride commented 6 years ago

@simonoakesepimorphics @skwlilac @ehedgehog

Re versioning - I don't have a strong opinion. /me is just a humble user. It just looked odd having a x.x.1 version on a new feature. I guess you should stick to whatever versioning policy Elda has, unless there is good reason to change it.

I guess a little figurin' is needed on how to do the configuration. Where to put the configuration, whether its global, per endpoint, ... Also what is configurable? Is just on/off or are the headers used configurable (with defaults?)?

ijdickinson commented 6 years ago

I'm not sure I agree with using a configuration to turn on behaviour that ought to be normal (i.e. it is known to cause problems if the config switch is left in the 'off' position). I would feel pretty annoyed if I encountered this issue in a new Elda installation if it turned out that I had to set a config option first. Especially as we don't have a generator for 'default' Elda configs.

Existing installations will only break when a new version of the code is installed. People who install new versions in active installations ought to read the release notes first, and the release notes will say clearly that there has been a change.

skwlilac commented 6 years ago

FWIW re: the configuration question... it is not at all clear you would do the configuration. There seem to be 4 options:

a. it's just on. b. elda config, so per end point with per api: defaults c. web.xml - but that breaks the common pre-built .war file approach d. yet another config file - like the one for setting up config reload intervals (I didn't know that existed until recently).

a. is clearly the simplest. But... there is a question as to whether it opens up an attack vector (particularly X-Forwarded-Host because it can change the target host of links in responses). I think it really needs to be the case that X-Forwarded-* needs to be stripped from a request at the boarder (LB or proxy) and re-introduced as necessary to serve our own purpose. So I think there is a case for being able say that those are headers I wish to ignore.

skwlilac commented 6 years ago

@simonoakesepimorphics @ehedgehog

I've just been trying some wget with X-Forwarded-Proto and X-Forwarded-Host headers. When there is only an X-Forwarded-Proto header the code is not picking up the host name from the request URI instead is generating URI of the form: <first href="https:/elda/doc/bathing-water.xml?_page=0"/>

I think the problem is at line 86 of com.epimorphics.util.URIUtils

        URI mid = baseAsURI.isAbsolute() ? baseAsURI : requestUri.resolve( baseAsURI );

At thus point baseAsURI has a string value of: https:/elda/ which is an absolute URI (URI are absolute when the have an associated scheme (in particular it's not absolute even if it begins with a /).

The request I used was:

wget -S --header="X-Forwarded-Proto:https" http://localhost:8080/elda/doc/bathing-water.xml

With the configuration for the EA bathing water site as my test environment.

This uses and api:base value of /elda - hostless relative. I was expecting the process to pick up the host component from the request URI since there is nothing else overriding it.

There are 3 possible sources of hosts (in order of precedence):

simonoakesepimorphics commented 6 years ago

The feature is on by default and configurable at the API level. We can possibly support more / less granular control in future.