Implement mixed behavior: mitm proxy/tunnel proxy

helloritesh commented 8 years ago

http-mitm-proxy doesn't expose an "onConnect" hook. How can I tunnel HTTPS directly without creating an httpsServer?

In some cases, a proxied HTTPS request doesn't need decrypting : )

felicienfrancois commented 8 years ago

node-http-mitm-proxy is a MITM proxy. This means it has been designed to read and alter requests and responses, which is why it needs a generated certificate to re-encrypt the altered content. Content filtering solutions do not require this because they can only forward or not forward requests and responses, without being able to read the encrypted data or alter it. Those kinds of "standard" proxies can:

Read the target hostname and port of SSL requests, but not the URL, headers, or content
Forward or cancel requests, but not alter them
See only the encrypted bytes of a response, not its content
Forward or cancel responses, but not alter or fake them

If you want to access more information from the request, or to be able to alter the request or response, you must use a MITM proxy, which has to decrypt and re-encrypt requests and responses with generated certificates.
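To make the limitation concrete, here is a minimal sketch of everything a non-MITM proxy learns about an HTTPS request: the target named on the CONNECT line. `parseConnectTarget` is a hypothetical helper written for illustration, not part of node-http-mitm-proxy, and the fallback to port 443 is an assumption.

```javascript
// Hypothetical helper (not part of node-http-mitm-proxy).
// For "CONNECT example.com:443 HTTP/1.1", Node's http server exposes
// "example.com:443" as req.url. A tunneling proxy sees nothing beyond that.
function parseConnectTarget(reqUrl) {
  // Optional ":port" suffix; the lazy prefix keeps bracketed IPv6 literals intact.
  var m = /^(.*?)(?::(\d+))?$/.exec(reqUrl);
  return {
    // Strip the brackets around an IPv6 literal such as "[::1]".
    hostname: m[1].replace(/^\[|\]$/g, ''),
    // Assumption: fall back to the default HTTPS port when none is given.
    port: m[2] ? parseInt(m[2], 10) : 443
  };
}
```

The URL path, headers, and body never appear here because they only travel inside the encrypted stream.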

felicienfrancois commented 8 years ago

For now, node-http-mitm-proxy does not support mixing standard and MITM proxy behavior, but it could be a good idea for a pull request. It would allow the proxy to simply forward traffic without altering it (and without needing to generate a certificate) for most hostnames.

It would speed up the proxy and generate fewer certificates for people who only need to read and alter requests and responses on a few hostnames.

This may be a bit tricky to implement, but it is possible. I can guide you if you want to dig deeper into this.

helloritesh commented 8 years ago

Awesome.. Also, could we make this conditional? Meaning, mitm proxy for some urls and bypass a few. Any chance you could also point me to the code section which I could edit?

felicienfrancois commented 8 years ago

The switch can only be done at the hostname level, not at the request/URL level. I will try to explain why.

When a client makes an HTTPS request through a proxy, it first sends a CONNECT request that names the target hostname and port in clear text, so that relays can forward the traffic to the server. It then performs the SSL handshake with the server: the server presents its certificate (which contains its public key), the client verifies it, and both sides agree on session keys. This step happens once, before a series of requests. From then on, the requests and responses are encrypted with those session keys, so only the client and the destination server can read them. The encryption also certifies that the responses come from the server AND prevents any modification of the content along the way.

Standard proxy works the following way:

At the connect/SSL handshake step: forward the handshake between client and server untouched, including the server certificate and its public key
At the request step: forward the encrypted request, which the proxy cannot read (except for the target hostname), OR don't forward it (i.e. filter it based on hostname)
At the response step: forward the encrypted response to the client (the proxy can neither read nor change it) OR don't forward it (i.e. drop the connection)
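The standard proxy steps above can be sketched with Node's built-in `http` and `net` modules. This is only an illustration of plain tunneling, not code from node-http-mitm-proxy; `createTunnelProxy` is a hypothetical name, and the sketch ignores IPv6 literals and plain HTTP requests.

```javascript
var http = require('http');
var net = require('net');

// A minimal sketch of a "standard" (tunneling) proxy.
// It only ever learns the CONNECT target; everything else stays encrypted.
function createTunnelProxy() {
  var server = http.createServer(function (req, res) {
    // Plain HTTP requests are out of scope for this sketch.
    res.writeHead(501);
    res.end();
  });

  // Node emits 'connect' for every CONNECT request it receives.
  server.on('connect', function (req, clientSocket, head) {
    var parts = req.url.split(':'); // "hostname:port"
    var hostname = parts[0];
    var port = parseInt(parts[1], 10) || 443;

    var serverSocket = net.connect(port, hostname, function () {
      // Tell the client the tunnel is open, then relay bytes both ways.
      clientSocket.write('HTTP/1.1 200 Connection Established\r\n\r\n');
      if (head && head.length) serverSocket.write(head);
      serverSocket.pipe(clientSocket);
      clientSocket.pipe(serverSocket);
    });

    // If either side fails, tear down the other side as well.
    serverSocket.on('error', function () { clientSocket.destroy(); });
    clientSocket.on('error', function () { serverSocket.destroy(); });
  });

  return server;
}
```

Calling `createTunnelProxy().listen(8080)` (the port is an arbitrary choice) and pointing a client at it, for example `curl -x http://localhost:8080 https://example.com/`, should go through without any generated certificate, because the SSL handshake happens directly between the client and the real server.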

MITM proxy works this way:

At the connect/SSL handshake step: generate a homemade certificate for the requested hostname (so the proxy owns both the public and private key for the upcoming requests) and present it to the client. On the other side, connect to the server
At the request step: decrypt the request with the proxy's private key, read and alter it, then re-encrypt it for the server
At the response step: decrypt the response from the server, read and alter it, then re-encrypt it with the proxy's homemade key and send it back to the client

So implementing mixed behavior is a bit tricky and can only be done at the hostname level, because 1) the connect step happens before any request and with only the hostname information, and 2) browsers may cache public keys and detect when they change, so even cleaning generated certificates or changing behavior during a browsing session may break things.

If you still want to implement this mixed behavior, you'll need to add an include / exclude hostname parameter and then, based on this parameter, change the connect step: https://github.com/joeferner/node-http-mitm-proxy/blob/master/lib/proxy.js#L202
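As a rough illustration of what such an include / exclude parameter could look like, here is a sketch of a hostname matcher. The option shape `{ include: [...], exclude: [...] }`, the `*.` wildcard syntax, and the name `shouldIntercept` are all assumptions made for this example; none of them exist in node-http-mitm-proxy.

```javascript
// Hypothetical helper: decide per hostname whether to MITM or to tunnel.
// Assumed option shape: { include: ['*.example.com'], exclude: ['bank.example.com'] }
function shouldIntercept(hostname, options) {
  var include = (options && options.include) || [];
  var exclude = (options && options.exclude) || [];
  var host = hostname.toLowerCase();

  function matches(pattern) {
    pattern = pattern.toLowerCase();
    if (pattern.indexOf('*.') === 0) {
      // "*.example.com" matches any subdomain, but not "example.com" itself.
      var suffix = pattern.slice(1); // ".example.com"
      return host.length > suffix.length &&
        host.slice(-suffix.length) === suffix;
    }
    return host === pattern;
  }

  if (exclude.some(matches)) return false; // exclude always wins
  if (include.length === 0) return true;   // no include list: MITM everything else
  return include.some(matches);
}
```

The decision takes only a hostname as input because that is the only information available at the connect step.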

For now, it connects either to the internal HTTP server or to one of the internal HTTPS servers (creating it if it does not exist), which then handles the SSL handshake. You may instead want to pipe the socket connection to the target server, so that the SSL handshake is forwarded to it.

So instead of this: https://github.com/joeferner/node-http-mitm-proxy/blob/master/lib/proxy.js#L219

var conn = net.connect(port, 'localhost', function() {

you could try:

var conn = net.connect(portFromRequest, hostname, function() {
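Put differently, the connect step would pick between two destinations. A sketch under assumptions: `pickConnectDestination`, `interceptHosts`, and `internalPort` are hypothetical names that do not appear in lib/proxy.js, and the hostname list is a plain array of exact names.

```javascript
// Hypothetical helper: choose where the client socket should be piped.
// - hostnames to intercept go to the proxy's internal HTTPS server on localhost
// - every other hostname goes straight to the requested server (plain tunnel)
function pickConnectDestination(reqUrl, interceptHosts, internalPort) {
  var parts = reqUrl.split(':'); // req.url of a CONNECT request is "hostname:port"
  var hostname = parts[0];
  var portFromRequest = parseInt(parts[1], 10) || 443;

  if (interceptHosts.indexOf(hostname) !== -1) {
    return { host: 'localhost', port: internalPort, mitm: true };
  }
  return { host: hostname, port: portFromRequest, mitm: false };
}
```

The returned `port` and `host` would then be passed to `net.connect(dest.port, dest.host, ...)` in place of the hard-coded `'localhost'`.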

switer commented 8 years ago

@felicienfrancois thanks for your reply. I just need an onConnect callback and a way to prevent the default MITM action in some cases.

As you say, CONNECT works at the hostname level, and it's hard to patch props and methods onto ctx. So using the arguments of the native connect event's callback is better than "ctx", as shown below:

Proxy.prototype._onHttpServerConnect = function(req, socket, head) {
  var self = this;

  // we need the first byte of data to detect whether the request is SSL encrypted
  if (!head || head.length === 0) {

    // give every registered onConnect handler a chance to run first
    return async.forEach(self.onConnectHandlers, function (fn, callback) {
      // call, not apply: apply expects the arguments as a single array
      return fn.call(self, req, socket, head, callback);
    }, function (err) {

      // default action: acknowledge the CONNECT and wait for the first data chunk
      socket.once('data', self._onHttpServerConnect.bind(self, req, socket));
      socket.write('HTTP/1.1 200 OK\r\n');
      if (self.keepAlive && req.headers['proxy-connection'] === 'keep-alive') {
        socket.write('Proxy-Connection: keep-alive\r\n');
        socket.write('Connection: keep-alive\r\n');
      }
      return socket.write('\r\n');

    });
  }

  // ... (rest of the method unchanged)
};

In your opinion, is this right?

felicienfrancois commented 8 years ago

Seems right to me. Maybe you should also prevent default behavior if err is not empty. PR welcome.
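As a rough sketch of "preventing the default behavior if err is not empty": the runner below stops at the first handler that reports an error and passes that error to the final callback, which can then skip the default action. It is dependency-free and runs handlers in series, whereas `async.forEach` starts them in parallel, so it is a swapped-in technique rather than the code from the PR. `runConnectHandlers` is a hypothetical name, and handlers are assumed to have the signature `(req, socket, head, callback)`.

```javascript
// Hypothetical runner: call each handler in order, stop on the first error.
function runConnectHandlers(handlers, ctx, req, socket, head, done) {
  var i = 0;
  function next(err) {
    if (err) return done(err);                  // a handler failed: skip the rest
    if (i >= handlers.length) return done(null); // all handlers passed
    var fn = handlers[i++];
    fn.call(ctx, req, socket, head, next);
  }
  next();
}
```

In the final callback, the default "200 OK" response would only be written when `err` is empty. What to do otherwise (for example destroying the socket) is a design choice this sketch leaves open.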

Also, I think it should be left undocumented for now because there are several drawbacks to hooking into the connect event:

connect is only used for HTTPS traffic (so the handling is asymmetric)
using mixed tunneling / MITM proxying may cause weird errors due to client certificate caches and HPKP

switer commented 8 years ago

@felicienfrancois I have added onConnect for the CONNECT method and split _onHttpServerConnect into two methods. Can you review and merge my PR?

felicienfrancois commented 8 years ago

@switer reviewed and merged. Thank you for your work.

switer commented 8 years ago

Thanks @felicienfrancois, I've opened another PR for the missing onConnect in the middleware.

Octolus commented 7 years ago

Hello. I believe I'm looking for a similar solution. I want to tunnel the data to another reverse proxy server (an nginx rproxy), then return whatever that rproxy returns. Is there any solution?

aDu commented 7 years ago

I would love this feature to be implemented. Can someone implement this?

joeferner / node-http-mitm-proxy

Implement mixed behavior: mitm proxy/tunnel proxy #50