greim / hoxy

Web-hacking proxy API for node
http://greim.github.io/hoxy/
MIT License

Feature request: Get original incoming request headers #72

Open sholladay opened 8 years ago

sholladay commented 8 years ago

I need to access data (at the moment, only the headers) from the original incoming request, as it was before hoxy modifies it for sending, from within a reverse proxy intercept.

In other words, given a setup like:

const server = require('hoxy').createServer({
    reverse : 'https://google.com'
});
server.listen({
    hostname : 'myproxy.com',
    port     : 9999
});

I need an intercept that can deduce whether a user accessed my proxy via 'myproxy.com' or its IP address '123.456.789.000'. I propose doing this by giving an intercept access to the original request (or at least its headers). At the moment, reading request.headers.host will tell me 'google.com' for the code above. The request is initialized to what it will be when sent, rather than what it was when received.

I think of the existing API as the outRequest, whereas I need to access the inRequest (or merely its headers, at the moment).

aemonge commented 8 years ago

You need the request interceptor:

// .......
proxy = hoxy.createServer({
    reverse: proxyHost
}).listen(program.port);
// .......
proxy.intercept({
    phase: 'request',
    as: 'json',
    method: 'POST'
}, requestInterceptor);
// .......
function requestInterceptor(request, response) {
    console.log(request.headers.host);
}

sholladay commented 8 years ago

Nope. See, it is easy to get confused. Here is a full working example. This will print "google.com", not "localhost". As a result, it is better to name the parameters outRequest and outResponse, as that is what they really are. This ticket is a feature request to give me access to the inRequest.

'use strict';

const server = require('hoxy').createServer({
    reverse : 'https://google.com'
});

server.intercept('request', (outRequest, outResponse) => {
    console.log(outRequest.headers.host);
});

server.listen({
    hostname : 'localhost',
    port     : 9999
});

greim commented 8 years ago

Yup. It's overwritten before the very first intercept here: https://github.com/greim/hoxy/blob/85242564df3415f367ebcb040844be5d203a1404/src/request.js#L149

Something like

req.origHeaders; // frozen, unaltered copy of original request headers
res.origHeaders; // frozen, unaltered copy of original response headers

?
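
As a rough illustration of that idea, and not hoxy's actual implementation, the snapshot could be taken just before the host header is overwritten. The helper name `snapshotHeaders` and the plain-object `req` below are assumptions made for the sketch; only `origHeaders` comes from the proposal above.

```javascript
'use strict';

// Hypothetical helper, not hoxy's actual code: take a frozen snapshot of the
// incoming headers before anything rewrites them.
function snapshotHeaders(headers) {
    return Object.freeze(Object.assign({}, headers));
}

// Stand-in for a request object; the real one is built inside hoxy.
const req = { headers: { host: 'localhost:9999', accept: '*/*' } };
req.origHeaders = snapshotHeaders(req.headers);

// Simulate the reverse-proxy rewrite that happens before the first intercept.
req.headers.host = 'google.com';

console.log(req.headers.host);     // outgoing value
console.log(req.origHeaders.host); // incoming value, unchanged
```

Because the copy is shallow and frozen, intercepts could read it freely but could not alter it by accident.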

sholladay commented 8 years ago

Right. That would be okay.

As far as design goes, I would ideally prefer to have distinct objects that represent the inRequest and the outRequest, etc. This could be implemented in a semver minor or major version by having a request.in and a request.out. Even if it were semver major and all of the existing APIs were moved to request.out, it would require only minimal (and scriptable) refactoring for existing users.

In this scenario, instead of:

request.origHeaders;

... I would have:

request.in.headers;

But this is more for my own OCD than anything.

greim commented 8 years ago

That would be a pretty sweeping conceptual change to how Hoxy works. If the origHeaders thing works for you, I'd be happy to add that though.

sholladay commented 8 years ago

I will take what I can get and be grateful. :)

That said, I could have someone from my team make a PR for this. I don't think it would have to be so drastic. It could be made 100% backwards compatible.
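
To make the backwards-compatibility point concrete, here is one way it might look. This is only a sketch under my own assumptions: `addInOutViews` is a made-up name, the request is a plain object rather than hoxy's real class, and `out` is simply an alias of the request itself so that no existing property has to move.

```javascript
'use strict';

// Hypothetical sketch, not hoxy's API: expose `in` and `out` views on the
// existing request object without moving any current property.
function addInOutViews(request, incomingHeaders) {
    const inView = Object.freeze({
        headers: Object.freeze(Object.assign({}, incomingHeaders))
    });
    Object.defineProperties(request, {
        // Read-only snapshot of the request as the client sent it.
        in: { value: inView },
        // Alias of the request itself, so request.out.headers === request.headers.
        out: { value: request }
    });
    return request;
}

const request = addInOutViews(
    { headers: { host: 'google.com' } },
    { host: 'localhost:9999' }
);

console.log(request.headers.host);     // existing code path still works
console.log(request.out.headers.host); // same object as above
console.log(request.in.headers.host);  // what the client actually sent
```

Existing intercepts that read request.headers would keep working unchanged, while new code could opt in to request.in and request.out.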

greim commented 8 years ago

Take a look at the orig-headers branch and let me know if it works for you.

sholladay commented 8 years ago

Now that I look at this again, I realize that the headers alone are not going to cut it for me. What I end up needing in my real-world code is the full origin that the request came in as. In other words, if I am listening on 0.0.0.0, requests can come in through http://localhost, https://localhost, http://127.0.0.1, https://127.0.0.1, etc. and I need to have the correct data. That's actually not available via the headers. I was probably thinking erroneously of the HTTP Origin header, but that's not correct for this case.

greim commented 8 years ago

You should be able to determine whether the client connected using https or http as follows:

  1. If the client connects directly to hoxy, by whether you're running hoxy with tls options.
  2. If the client connects through an intermediary like a load balancer, by the standard x-forwarded-* headers that the intermediary injects in flight.

sholladay commented 8 years ago

Right, okay. A combination of some saved state and the origHeaders may be workable.
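
Roughly, I imagine something like the following. It is a sketch, not tested against hoxy: `resolveIncomingOrigin` and `listeningWithTls` are names I made up, `origHeaders` is the proposed frozen copy from earlier in this thread, and it assumes any x-forwarded-proto / x-forwarded-host headers come from an intermediary I trust (a client could otherwise spoof them).

```javascript
'use strict';

// Hypothetical helper, not part of hoxy. `origHeaders` stands for the proposed
// frozen copy of the incoming headers (Node lowercases header names), and
// `listeningWithTls` is state saved when the proxy was created.
function resolveIncomingOrigin(origHeaders, listeningWithTls) {
    const forwardedProto = origHeaders['x-forwarded-proto'];
    const forwardedHost = origHeaders['x-forwarded-host'];

    // Prefer what a trusted intermediary reports; fall back to saved state.
    const scheme = forwardedProto
        ? forwardedProto.split(',')[0].trim().toLowerCase()
        : (listeningWithTls ? 'https' : 'http');

    // Forwarded values may be comma-separated lists; the first entry is the
    // one closest to the client.
    const host = forwardedHost
        ? forwardedHost.split(',')[0].trim()
        : origHeaders.host;

    return host ? scheme + '://' + host : null;
}

console.log(resolveIncomingOrigin({ host: 'localhost:9999' }, false));
console.log(resolveIncomingOrigin({ host: '127.0.0.1:9999' }, true));
```

The saved state covers the direct-connection case and the forwarded headers cover the load balancer case, which together match the two situations described above.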