There seems to be some inconsistency between different client implementations on how to handle proxying when the destination is http only. The rfc suggests that CONNECT is used to setup a tunnel which can later be secured by TLS
The CONNECT method requests that the recipient establish a tunnel to the destination origin server identified by the request target and, if successful, thereafter restrict its behavior to blind forwarding of data, in both directions, until the tunnel is closed. Tunnels are commonly used to create an end-to-end virtual connection, through one or more proxies, which can then be secured using TLS (Transport Layer Security, [TLS13]).
This could be interpreted as you don't necessarily need a TLS tunnel and so a client is allowed to issue a CONNECT request for a http destination.
Either way, there might be more clients out there that send CONNECT requests for http destinations. This PR aims to support such clients.
However, I still believe that it is redundant to do so, since there is no advantage of a tunnel when the destination is http only. The proxy can still read the messages that go through. Even curl's default behavior is to send the http request directly to the proxy without issuing a CONNECT request for such cases.
Implementation details
The idea here was to "fool" the client when the proxy receives a CONNECT request. The proxy will respond back with OK but won't actually setup a tcp tunnel and try to process the request on its own.
However, in node.js when the server receives a "connect" event (i.e. a CONNECT request) it removes all data handlers from the client's socket including the in built node.js http parsers:
So the proxy is only able receive raw http requests by reattaching the data handlers (i.e. all the express request handlers are removed). Now for the proxy to process the request that comes after the CONNECT request it will have to parse the raw http request.
Instead of trying to parse the raw http request and missing any important details, I decided to spin up another forward proxy. The first forward proxy then sets up a tunnel between the client and the second forward proxy which then receives the http request and processes it easily.
Drawbacks
Spinning up another forward proxy is a costly operation. For future improvements, we should try to figure out the best way to parse the raw http or reattach the data and request handlers for the client's socket
Fixes https://github.com/kiegroup/mock-github/issues/72
There seems to be some inconsistency between different client implementations on how to handle proxying when the destination is
http
only. The rfc suggests thatCONNECT
is used to setup a tunnel which can later be secured by TLSThis could be interpreted as you don't necessarily need a TLS tunnel and so a client is allowed to issue a CONNECT request for a http destination.
There has been no response from the maintainers of
@actions/toolkit
, so I am not sure whether they will accept my PR to change their proxying library (see https://github.com/kiegroup/mock-github/issues/72#issuecomment-1569223066).Either way, there might be more clients out there that send
CONNECT
requests for http destinations. This PR aims to support such clients.However, I still believe that it is redundant to do so, since there is no advantage of a tunnel when the destination is http only. The proxy can still read the messages that go through. Even
curl
's default behavior is to send the http request directly to the proxy without issuing aCONNECT
request for such cases.Implementation details
The idea here was to "fool" the client when the proxy receives a
CONNECT
request. The proxy will respond back with OK but won't actually setup a tcp tunnel and try to process the request on its own.However, in node.js when the server receives a "connect" event (i.e. a
CONNECT
request) it removes all data handlers from the client's socket including the in built node.js http parsers:So the proxy is only able receive raw http requests by reattaching the data handlers (i.e. all the express request handlers are removed). Now for the proxy to process the request that comes after the
CONNECT
request it will have to parse the raw http request.Instead of trying to parse the raw http request and missing any important details, I decided to spin up another forward proxy. The first forward proxy then sets up a tunnel between the client and the second forward proxy which then receives the http request and processes it easily.
Drawbacks
Spinning up another forward proxy is a costly operation. For future improvements, we should try to figure out the best way to parse the raw http or reattach the data and request handlers for the client's socket