geosolutions-it / http-proxy

Lean and Mean HTTP Proxy written in Java
GNU General Public License v3.0

Slowdown with large response bodies #44

Closed: offtherailz closed this issue 7 years ago

offtherailz commented 7 years ago

Example request body:

<wfs:GetFeature service="WFS" version="1.1.0" xmlns:gml="http://www.opengis.net/gml" xmlns:wfs="http://www.opengis.net/wfs" xmlns:ogc="http://www.opengis.net/ogc" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wfs http://schemas.opengis.net/wfs/1.1.0/wfs.xsd"><wfs:Query typeName="lhtac:web2014all_mv" srsName="EPSG:4326"><ogc:Filter><ogc:And><ogc:Intersects><ogc:PropertyName>geom3857</ogc:PropertyName><gml:Polygon srsName="EPSG:4326"><gml:exterior><gml:LinearRing><gml:posList>-117.18017578125 47.4057852900587 -117.18017578125 47.60616304386874 -116.60888671874999 47.60616304386874 -116.60888671874999 47.4057852900587 -117.18017578125 47.4057852900587</gml:posList></gml:LinearRing></gml:exterior></gml:Polygon></ogc:Intersects><ogc:Intersects><ogc:PropertyName>geom3857</ogc:PropertyName><ogc:Function name="collectGeometries"><ogc:Function name="queryCollection"><ogc:Literal>lhtac:itd_districts</ogc:Literal><ogc:Literal>the_geom</ogc:Literal><ogc:Literal>ITD_Dist_n IN ('1')</ogc:Literal></ogc:Function></ogc:Function></ogc:Intersects></ogc:And></ogc:Filter></wfs:Query></wfs:GetFeature>

POSTed directly to this URL, it takes 247 ms: http://demo.geo-solutions.it/geoserver/ows?service=WFS&outputFormat=application/json&

image

Using the proxy: http://lhtac.geo-solutions.it/lhtac-webgis/proxy/?url=http%3A%2F%2Fdemo.geo-solutions.it%2Fgeoserver%2Fows%3Fservice%3DWFS%26outputFormat%3Dapplication%2Fjson

Takes 6156 ms: image

This is the full HTTP request:

POST /lhtac-webgis/proxy/?url=http%3A%2F%2Fdemo.geo-solutions.it%2Fgeoserver%2Fows%3Fservice%3DWFS%26outputFormat%3Dapplication%2Fjson HTTP/1.1
Host: lhtac.geo-solutions.it
Content-Type: application/xml
Accept: application/json
Cache-Control: no-cache
Postman-Token: b9978abf-28a6-fc83-9157-d3546c6c7fc9

<wfs:GetFeature service="WFS" version="1.1.0" xmlns:gml="http://www.opengis.net/gml" xmlns:wfs="http://www.opengis.net/wfs" xmlns:ogc="http://www.opengis.net/ogc" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wfs http://schemas.opengis.net/wfs/1.1.0/wfs.xsd"><wfs:Query typeName="lhtac:web2014all_mv" srsName="EPSG:4326"><ogc:Filter><ogc:And><ogc:Intersects><ogc:PropertyName>geom3857</ogc:PropertyName><gml:Polygon srsName="EPSG:4326"><gml:exterior><gml:LinearRing><gml:posList>-117.18017578125 47.4057852900587 -117.18017578125 47.60616304386874 -116.60888671874999 47.60616304386874 -116.60888671874999 47.4057852900587 -117.18017578125 47.4057852900587</gml:posList></gml:LinearRing></gml:exterior></gml:Polygon></ogc:Intersects><ogc:Intersects><ogc:PropertyName>geom3857</ogc:PropertyName><ogc:Function name="collectGeometries"><ogc:Function name="queryCollection"><ogc:Literal>lhtac:itd_districts</ogc:Literal><ogc:Literal>the_geom</ogc:Literal><ogc:Literal>ITD_Dist_n IN ('1')</ogc:Literal></ogc:Function></ogc:Function></ogc:Intersects></ogc:And></ogc:Filter></wfs:Query></wfs:GetFeature>
simboss commented 7 years ago

Can we get the value for the defaultStreamByteSize config param?

Is it set?

The default (1 KB) does not seem reasonable to me, and it might slow down the processing of large responses.
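To illustrate why a 1 KB buffer could matter, here is a minimal sketch of a stream copy loop with a configurable buffer size. It assumes that `defaultStreamByteSize` is used as the size of the copy buffer, which this thread does not confirm; `BufferSizeSketch`, `copy`, and the 8192-byte value are hypothetical names and numbers chosen for illustration, not the proxy's actual code. The method returns the number of `read()` calls that delivered data, so the effect of the buffer size is easy to see.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical sketch, not the http-proxy implementation.
final class BufferSizeSketch {

    /**
     * Copies everything from in to out using a buffer of bufferSize bytes.
     * Returns how many read() calls delivered data.
     */
    static int copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        int reads = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write only the bytes actually read
            reads++;
        }
        return reads;
    }

    public static void main(String[] args) throws IOException {
        byte[] body = new byte[1024 * 1024]; // stand-in for a 1 MiB response body
        int small = copy(new ByteArrayInputStream(body), new ByteArrayOutputStream(), 1024);
        int large = copy(new ByteArrayInputStream(body), new ByteArrayOutputStream(), 8192);
        System.out.println("1 KB buffer: " + small + " reads, 8 KB buffer: " + large + " reads");
    }
}
```

With an in-memory source, the 1 KB buffer needs eight times as many read/write round trips as the 8 KB one. On a real socket each read may return fewer bytes than the buffer holds, so the actual gain depends on the network and should be measured rather than assumed.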

mbarto commented 7 years ago

I think this is part of (or all) the problem:

https://github.com/geosolutions-it/http-proxy/blob/32dff1d6d8bcc95d320d0938c139a4b33a212734/src/main/java/it/geosolutions/httpproxy/HTTPProxy.java#L789-L797

Basically, we read the whole response into memory and only then flush it to the servlet output stream. I would simply remove the ByteArrayOutputStream usage.
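The linked lines are not reproduced in this thread, so the following is only a sketch of the two approaches being contrasted, under the assumption that the proxy copies an upstream `InputStream` to the servlet's `OutputStream`. `ProxyCopySketch`, `bufferThenWrite`, and `stream` are hypothetical names. Both methods deliver the same bytes; the difference is when the client starts receiving them and how much memory is held.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical sketch, not the code at the linked lines.
final class ProxyCopySketch {

    /** Buffers the whole body first: the client receives nothing until upstream is exhausted. */
    static void bufferThenWrite(InputStream upstream, OutputStream client, int bufferSize)
            throws IOException {
        ByteArrayOutputStream whole = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        int n;
        while ((n = upstream.read(buffer)) != -1) {
            whole.write(buffer, 0, n); // memory use grows with the response size
        }
        whole.writeTo(client);
        client.flush();
    }

    /** Streams chunk by chunk: memory use is bounded by bufferSize and writing starts at once. */
    static void stream(InputStream upstream, OutputStream client, int bufferSize)
            throws IOException {
        byte[] buffer = new byte[bufferSize];
        int n;
        while ((n = upstream.read(buffer)) != -1) {
            client.write(buffer, 0, n); // forward each chunk as soon as it arrives
        }
        client.flush();
    }
}
```

In a servlet, `client` would typically be `response.getOutputStream()`. The container may still buffer some output before sending it, so the streaming variant improves time to first byte only as far as the container allows.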

simboss commented 7 years ago

Good catch Mauro.

I think we need to:

offtherailz commented 7 years ago

While applying the fixes I noticed that the issue is not visible in my local environment. I suppose it is visible in production because of some networking bottleneck on the http-proxy side (either in reading the response or in sending it back to the client), which can in any case be amplified by the issue I fixed.

Results compared with the local proxy: image

So there is no appreciable difference between the normal request and the proxied one.

Note: the response starts late in the proxy version.

Proxy response timing without the fix: image

Of course, it is the same with the fixes applied: image

(Note: of course we cannot see a difference between the normal request and the proxied request, so we don't expect any improvement from our changes in local tests.)

But it starts writing the response earlier: image

I suggest merging my changes anyway to improve the response timing, and then seeing whether they reduce the problem. What do you think?

Note: I changed the buffer size locally but, of course, nothing changed. I will not commit that change; I'd like to run some tests in the dev environment instead, to see whether it improves performance too.

simboss commented 7 years ago

Well, the writing above is not extremely clear to me.

Anyway, I would go ahead, incorporate my suggestions and do the merge.

With large responses it is a waste of memory to load everything before proxying!