This came up based on an API user's request for developer.nrel.gov. After analyzing the past week's worth of log files, I'm pretty sure this shouldn't have any adverse impact, but I wanted to document this change here in any case.
When we receive an API request, one of the first things we have to do for routing the request is determine which host the request belongs to (so we can distinguish between requests for developer.nrel.gov, api.nasa.gov, etc). Currently the code for that logic like this:
ngx.ctx.host = ngx_var.http_x_forwarded_host or ngx_var.http_host or ngx_var.host
So basically, we take whichever option is set first in this order:
X-Forwarded-Host HTTP header
Host HTTP header
The server's name if a request without a Host header or empty Host header comes in
All valid requests will set a Host header, so that's the most commonly matched mechanism by far. The only case X-Forwarded-Host would be set is if there's another proxy sitting in front of api.data.gov that manipulates the Host header in some way. However, in these cases, it doesn't seem like matching the forwarded host is actually the most appropriate behavior, since then we're trying to match against some arbitrary host our system has likely never seen before. Instead, matching on the Host header seems more appropriate even in cases where there is another proxy in front of api.data.gov. We'd probably only want to match the forwarded host in special cases where we're the ones in control of the proxy sitting in front of api.data.gov (but based on traffic logs, we don't have any of those special cases).
So the change we're looking to make is removing X-Forwarded-Host from what will be matched, and using Host. So the logic simply becomes:
ngx.ctx.host = ngx_var.http_host or ngx_var.host
The reason for this change is to make it easier to put another proxy in front of an api.data.gov service in certain circumstances. I don't expect these circumstances to be very common, but since this seems like more correct behavior anyway, making the change seems like a good idea.
The specific case that triggered this was a user that needs to proxy to an NREL API for corporate policy and network reasons. They were also using Apache 2.2's mod_proxy to perform that proxy. Apache 2.2's mod_proxy forces the X-Forwarded-Host to be set, which was then causing 404 not found errors for the proxied API requests, since our servers were then trying to match the requests based on the private domain used by the company (eg, we were trying to match private-proxy.apis.example.com instead of developer.nrel.gov). Apache 2.2 has no way to disable this X-Forwarded-Host header (it's only possible in Apache 2.3.10+ since it's hardcoded in 2.2), so there wasn't a particularly easy solution on the client's side without upgrading Apache or using a different proxy server.
Based on 1 week's worth of raw logs, the X-Forwarded-Host header is not set on most of our requests. In the legitimate cases where it is set, the Host header has the exact same value, so those requests would be handled identically. This means I don't anticipate this very tiny code change having any impact (other than allowing these type of external proxies to be setup more easily in some cases).
This came up based on an API user's request for developer.nrel.gov. After analyzing the past week's worth of log files, I'm pretty sure this shouldn't have any adverse impact, but I wanted to document this change here in any case.
When we receive an API request, one of the first things we have to do for routing the request is determine which host the request belongs to (so we can distinguish between requests for developer.nrel.gov, api.nasa.gov, etc). Currently the code for that logic like this:
So basically, we take whichever option is set first in this order:
X-Forwarded-Host
HTTP headerHost
HTTP headerHost
header or emptyHost
header comes inAll valid requests will set a
Host
header, so that's the most commonly matched mechanism by far. The only caseX-Forwarded-Host
would be set is if there's another proxy sitting in front of api.data.gov that manipulates the Host header in some way. However, in these cases, it doesn't seem like matching the forwarded host is actually the most appropriate behavior, since then we're trying to match against some arbitrary host our system has likely never seen before. Instead, matching on the Host header seems more appropriate even in cases where there is another proxy in front of api.data.gov. We'd probably only want to match the forwarded host in special cases where we're the ones in control of the proxy sitting in front of api.data.gov (but based on traffic logs, we don't have any of those special cases).So the change we're looking to make is removing
X-Forwarded-Host
from what will be matched, and usingHost
. So the logic simply becomes:The reason for this change is to make it easier to put another proxy in front of an api.data.gov service in certain circumstances. I don't expect these circumstances to be very common, but since this seems like more correct behavior anyway, making the change seems like a good idea.
The specific case that triggered this was a user that needs to proxy to an NREL API for corporate policy and network reasons. They were also using Apache 2.2's mod_proxy to perform that proxy. Apache 2.2's mod_proxy forces the
X-Forwarded-Host
to be set, which was then causing 404 not found errors for the proxied API requests, since our servers were then trying to match the requests based on the private domain used by the company (eg, we were trying to matchprivate-proxy.apis.example.com
instead ofdeveloper.nrel.gov
). Apache 2.2 has no way to disable thisX-Forwarded-Host
header (it's only possible in Apache 2.3.10+ since it's hardcoded in 2.2), so there wasn't a particularly easy solution on the client's side without upgrading Apache or using a different proxy server.Based on 1 week's worth of raw logs, the
X-Forwarded-Host
header is not set on most of our requests. In the legitimate cases where it is set, theHost
header has the exact same value, so those requests would be handled identically. This means I don't anticipate this very tiny code change having any impact (other than allowing these type of external proxies to be setup more easily in some cases).