twitchax / AspNetCore.Proxy

ASP.NET Core Proxies made easy.
MIT License
525 stars 83 forks

Does all the traffic continue to go through the proxy? #51

Closed bnssoftware closed 4 years ago

bnssoftware commented 4 years ago

This is probably more of a basic proxy theory question, but I was curious: after the client's initial GET request to a certain URL, once the proxy server returns the new domain/path, does the proxy server continue to participate in the request, such as by transferring data to the client? Or does the proxy server "terminate" at that point, so that the client communicates directly with the end server?

twitchax commented 4 years ago

So, I think you are thinking more of a redirect. In that case, the server tells the client to go to a different address.

A proxy always, by definition, makes the request to the endpoint on behalf of the client. So, yes, the proxy would continue to participate, as the client would keep sending the requests to http://myproxy.com/, and the proxy would forward each of those requests to the endpoint.
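To make the distinction concrete, here is a minimal sketch in Python. It is not code from this library: `Response`, `fake_endpoint`, `redirect_handler`, `proxy_handler`, and the host `endpoint.example` are hypothetical names used only for illustration, and no real network calls are made.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)
    body: bytes = b""


def fake_endpoint(path: str) -> Response:
    # Hypothetical stand-in for the real backend server.
    return Response(200, {"Content-Type": "text/plain"},
                    f"endpoint saw {path}".encode())


def redirect_handler(path: str) -> Response:
    # Redirect: the server only tells the client where to go next.
    # The client then talks to the endpoint directly.
    return Response(302, {"Location": "http://endpoint.example" + path})


def proxy_handler(path: str) -> Response:
    # Proxy: the server makes the request on the client's behalf and
    # relays the endpoint's response, so it takes part in every request.
    upstream = fake_endpoint(path)
    return Response(upstream.status, dict(upstream.headers), upstream.body)
```

With the redirect handler, the response body never passes through the server; with the proxy handler, every byte of every response does.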

softwareguy74 commented 4 years ago

So is there any concern with the performance of the specific implementation of the proxy as it tunnels the traffic back and forth? For example, does your implementation (with respect to JUST the tunneling) differ at all from, say, https://github.com/microsoft/reverse-proxy

I'm just trying to get a feel for how performant this proxy is compared to others, like the one referenced above or HAProxy, nginx, etc., before I spend all my time working with it.

twitchax commented 4 years ago

In general, the implementation is not going to differ too much.

The meat of AspNetCore.Proxy is here.

The meat of the reverse-proxy is here. While the names they have chosen for their methods and the names I have chosen for mine are different (and they check for a few edge cases I have never encountered with this library), the implementations are very similar. Also, I might add, the method NormalProxyAsync is commented extremely well. They outline the whole method at the top, and each line of code that represents one of the steps is commented in the implementation. Really quite nice. 😄

At the end of the day, an HTTP proxy is going to inspect the incoming request from the client, make some decision about how to proxy that request based on the method type and headers, copy/mutate a few of the headers, copy the incoming request, and fire it off to the requested endpoint. Then, when the response comes back from the endpoint, it is going to do mostly the same thing, except it will ignore most of the headers, etc., and copy that response to the response that is sent to the client.
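The request half of that flow can be sketched roughly as follows. This is an illustration in Python, not the code of AspNetCore.Proxy or reverse-proxy: the function names, the dictionary-based request shape, and the exact set of headers dropped are assumptions (the set shown is the one commonly treated as hop-by-hop).

```python
from urllib.parse import urlsplit

# Headers commonly treated as hop-by-hop, i.e. not forwarded by a proxy.
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
})


def forwardable_headers(headers: dict) -> dict:
    """Return the headers that are safe to copy to the other side."""
    # Any header named inside the Connection header is hop-by-hop as well.
    listed = set()
    for name, value in headers.items():
        if name.lower() == "connection":
            listed.update(t.strip().lower() for t in value.split(",") if t.strip())
    drop = HOP_BY_HOP | listed
    return {n: v for n, v in headers.items() if n.lower() not in drop}


def build_upstream_request(method, path, headers, body, upstream_base, client_ip):
    """Copy/mutate the incoming request into one aimed at the endpoint."""
    prior = next((v for n, v in headers.items()
                  if n.lower() == "x-forwarded-for"), None)
    out = {n: v for n, v in forwardable_headers(headers).items()
           if n.lower() not in ("host", "x-forwarded-for")}
    out["Host"] = urlsplit(upstream_base).netloc  # address the endpoint
    out["X-Forwarded-For"] = f"{prior}, {client_ip}" if prior else client_ip
    return {
        "method": method,
        "url": upstream_base.rstrip("/") + path,
        "headers": out,
        "body": body,
    }
```

The response direction would reuse `forwardable_headers` on the endpoint's response before copying the status, headers, and body back to the client.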

HTTP proxies proxy each request individually, and no tunnels are created. WebSocket proxies can "tunnel", and this library supports that. If you want a proxy that truly "tunnels", then you should check out a SOCKS proxy. SOCKS proxies open up two sockets: one to the client, and one to the endpoint, and then they literally just pump the bits through each of the sockets bidirectionally. You can find an extremely simple SOCKS proxy implementation I wrote in Rust here.
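The "pump the bits" part of a tunnel can be sketched in a few lines of Python. This is only the relay step, under the assumption that both sockets are already connected; the SOCKS handshake itself is omitted, and `pump` and `relay` are hypothetical names, not taken from the Rust implementation mentioned above.

```python
import socket
import threading


def pump(src: socket.socket, dst: socket.socket, bufsize: int = 4096) -> None:
    """Copy bytes from src to dst until src reaches EOF, then half-close dst."""
    while True:
        chunk = src.recv(bufsize)
        if not chunk:  # the peer finished sending
            break
        dst.sendall(chunk)
    try:
        dst.shutdown(socket.SHUT_WR)  # propagate EOF to the other side
    except OSError:
        pass  # dst was already closed


def relay(client_side: socket.socket, endpoint_side: socket.socket):
    """Start one pump per direction and return the two threads."""
    threads = [
        threading.Thread(target=pump, args=(client_side, endpoint_side)),
        threading.Thread(target=pump, args=(endpoint_side, client_side)),
    ]
    for t in threads:
        t.start()
    return threads
```

The relay never inspects the bytes, which is the key difference from the per-request HTTP proxying described above.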

I have not benchmarked this library too much, but your question makes me want to at some point. It has been plenty fast for my purposes, and I have yet to get any performance complaints. However, I don't think Twitter is using this, so take that with a grain of salt. 😛

All of that said, this library was not designed with the specific case of a reverse proxy in mind, and I would guess that HAProxy, nginx, and reverse-proxy are all much better choices for the case of a "pure" reverse proxy. That is, if you want a reverse proxy whose only job is to take http://mysite.com/my/path and proxy it in a load balanced way to a set of backend servers that look like http://10.10.10.243/my/path, then you should almost definitely use one of those three options.

This library was designed more as a super custom way to make an API gateway type application. API gateways typically act like a reverse proxy for disparate sets of APIs. So, you might have http://mytwitterapi.com/my/path and http://myfacebookapi.com/my/path, but you want to expose both of them from one endpoint. This library would help you have something like http://myapi.com/twitter/my/path go to http://mytwitterapi.com/my/path, and http://myapi.com/facebook/my/path go to http://myfacebookapi.com/my/path. There are obviously a bunch of other use cases, as well, but it was specifically designed to allow a developer to create a proxy server with highly customized routing rules, logging, sniffing, etc.
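The routing idea in that example can be sketched as a simple prefix table. This is a Python illustration of the concept only, not this library's API: `ROUTES` and `resolve_upstream` are hypothetical names, and the URLs are the ones from the example above.

```python
from typing import Optional

# Path prefix on the gateway -> base URL of the backing API.
ROUTES = {
    "/twitter": "http://mytwitterapi.com",
    "/facebook": "http://myfacebookapi.com",
}


def resolve_upstream(path: str, routes: dict = ROUTES) -> Optional[str]:
    """Map a gateway path to the upstream URL, or None if no rule matches."""
    for prefix, upstream in routes.items():
        # Match whole path segments only, so "/twitterx" does not match.
        if path == prefix or path.startswith(prefix + "/"):
            return upstream + path[len(prefix):]
    return None
```

For example, `resolve_upstream("/twitter/my/path")` yields `http://mytwitterapi.com/my/path`, and a path with no matching prefix yields `None`, which a gateway could turn into a 404.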

softwareguy74 commented 4 years ago

Thanks for the GREAT explanation! Your library is actually what I need as I'm doing what you described with the Twitter and Facebook APIs with a single endpoint. I will be very interested to see the performance stats.

twitchax commented 4 years ago

Closing this.