arykov / reverseproxy

Reverse proxy based on littleproxy

Add content filtering capability #1

Open maximdim opened 10 years ago

maximdim commented 10 years ago

It would be very useful to have a way to specify a transformation/filter for the response body.

Use case: reverse proxying a web service that returns JSON when only a small subset of the fields is actually needed. A filter could parse the JSON, remove the unneeded fields, and send a much smaller result back to the requester.
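To make the use case concrete, here is a minimal sketch of such a filter. It is not part of reverseproxy or LittleProxy: the class name `FieldWhitelistFilter` is hypothetical, and the sketch assumes the response body has already been parsed into a `Map` by whatever JSON library is in use. It only handles top-level fields.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical response-body filter: keeps only whitelisted top-level fields. */
final class FieldWhitelistFilter {
    private final Set<String> keep;

    FieldWhitelistFilter(Set<String> keep) {
        // Defensive copy so later changes to the caller's set have no effect.
        this.keep = new HashSet<>(keep);
    }

    /**
     * Returns a new map holding only the whitelisted fields, in their
     * original order. The input map (the parsed JSON body) is not modified.
     */
    Map<String, Object> apply(Map<String, Object> parsedBody) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : parsedBody.entrySet()) {
            if (keep.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }
}
```

The filtered map would then be serialized back to JSON before being returned to the requester; that step depends on the JSON library and is left out here.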

arykov commented 10 years ago

This seems like a potentially useful feature. There are two problems with doing it within the current implementation:

1) If the proxy is configured to direct both stephenfry.com and hughlaurie.net to 1.2.3.4, there is currently no way to tell on the way back whether the original request was sent to stephenfry.com or to hughlaurie.net. This makes it impossible to apply a filter selectively based on host:port. The solution I see is to put the responsibility on the filter implementer to modify the outgoing request so that it is identifiable at response time, for example by adding headers to the outgoing request that include the original host/port. Unfortunately, these headers would travel all the way to the target web server.

2) I don't think man-in-the-middle has been implemented for the https2https scenario. This would preclude us from doing request/response modification in that scenario.
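The header-tagging workaround from point 1 could look roughly like the sketch below. Everything in it is an assumption made for illustration: the header name `X-Original-Host` and the class `OriginTagger` are hypothetical, and headers are modeled as a plain `Map` rather than through the proxy's actual request objects.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: tag outgoing requests with the original host:port and read the tag back later. */
final class OriginTagger {
    // Hypothetical header name; not defined by reverseproxy or LittleProxy.
    static final String ORIGIN_HEADER = "X-Original-Host";

    /** Returns a copy of the request headers with the original host:port added. */
    static Map<String, String> tag(Map<String, String> requestHeaders, String host, int port) {
        Map<String, String> tagged = new HashMap<>(requestHeaders);
        tagged.put(ORIGIN_HEADER, host + ":" + port);
        return tagged;
    }

    /**
     * True if the tagged request was originally addressed to host:port.
     * Note: a plain Map is case-sensitive, while HTTP header names are not;
     * a real implementation would need to account for that.
     */
    static boolean wasSentTo(Map<String, String> requestHeaders, String host, int port) {
        return (host + ":" + port).equals(requestHeaders.get(ORIGIN_HEADER));
    }
}
```

At response time, a filter would call something like `wasSentTo(headers, "stephenfry.com", 80)` to decide whether it applies. As noted above, the extra header is also visible to the target server.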

Do you have specific timelines you need this for?

maximdim commented 10 years ago

In my case I need a very simple (simplistic) reverse proxy just to distribute load across many slave servers and to filter unused data out of the responses, reducing the data flow back to the requester. All requests go from a single client to a single server over HTTP, so neither 1) nor 2) applies in my case. Start simple and then expand if needed, perhaps? :)

arykov commented 10 years ago

That is simple enough. But just to make sure it is a good fit for whatever you are doing, keep two things in mind: 1) It does not load balance. It can switch where a request is routed on the fly, but somebody needs to make an API call to do that. 2) It is a reverse proxy implemented as an HTTP proxy, so you would not be able to put it in front of multiple servers the way you could with Apache, Nginx, etc.

The use case for this proxy is when I don't control, or don't want to bother with, DNS (hence the HTTP-proxy-based implementation) and I want to predictably hit the same endpoint unless I explicitly switch it.
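The "same endpoint until explicitly switched" behavior can be pictured with a small sketch. This is not the project's actual code or API: `RoutingTable` and its methods are hypothetical names, and falling back to the requested host when no route is set is an assumption made here for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch: sticky routing that changes only on an explicit call. */
final class RoutingTable {
    // Maps the requested "host:port" to the target "host:port".
    private final Map<String, String> routes = new ConcurrentHashMap<>();

    /** Explicit switch; stands in for the API call mentioned above. */
    void route(String requestedHostPort, String targetHostPort) {
        routes.put(requestedHostPort, targetHostPort);
    }

    /**
     * Returns the current target for a requested host:port. Repeated calls
     * give the same answer until route() is called again. With no route
     * configured, the requested host:port is returned unchanged (assumption).
     */
    String resolve(String requestedHostPort) {
        return routes.getOrDefault(requestedHostPort, requestedHostPort);
    }
}
```

Nothing here distributes requests across targets, which is the difference from a load balancer: every request for a given host:port goes to one target until someone calls `route` again.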

Let me know if this is what you need.