geopython / pywps

PyWPS is an implementation of the Web Processing Service standard from the Open Geospatial Consortium. PyWPS is written in Python.
https://pywps.org
MIT License

FileSizeExceededError causes connection reset in gunicorn #596

Open aulemahal opened 3 years ago

aulemahal commented 3 years ago

Description

This might not be the best issue title; I'm not quite sure what is happening myself. When I "push" a file that is too large as an input, the "File size for input exceeded" message is readable when I am using the basic server (a pywps Service through werkzeug.serving.run_simple). However, when using gunicorn, the connection is reset and the client fails with ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer')), raised from the requests package.

I have another setup, where the pywps service is launched the same way (with gunicorn), but is also behind nginx and twitcher (see comment below). In that case, I did receive a proper Exception XML, but not the right one:

<Exception exceptionCode="NoApplicableCode" locator="NotAcceptable">
        <ExceptionText>Request failed: (&#x27;Connection aborted.&#x27;, BrokenPipeError(32, &#x27;Broken pipe&#x27;))</ExceptionText>
</Exception>

Note that this other server is on another machine, while my first simple setup has the client and server on the same machine. Also, I am able to bypass nginx and twitcher on the second setup, which gives me the exact same error as my simple local setup: ConnectionResetError.
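For illustration only, here is a minimal standard-library sketch of one plausible mechanism behind these symptoms. It is an assumption, not something confirmed in this thread: a server that answers and closes the connection before it has read the whole upload. The names reject_early, upload and demo are hypothetical, and nothing here uses pywps or gunicorn. On Linux the client's send typically fails with ConnectionResetError or BrokenPipeError, which matches the two errors reported above; the exact error is platform and timing dependent.

```python
import socket
import threading


def reject_early(listener):
    """Accept one connection, reply after reading only the start, then close."""
    conn, _ = listener.accept()
    conn.recv(1024)  # read just the beginning; the rest of the upload stays unread
    conn.sendall(b"HTTP/1.1 400 Bad Request\r\nContent-Length: 0\r\n\r\n")
    conn.close()  # closing with unread data pending usually triggers a TCP reset


def upload(port, total_bytes=64 * 1024 * 1024):
    """Send total_bytes to the server; return the name of the error raised, if any."""
    chunk = b"x" * 65536
    sent = 0
    with socket.create_connection(("127.0.0.1", port)) as client:
        try:
            while sent < total_bytes:
                client.sendall(chunk)
                sent += len(chunk)
        except ConnectionError as err:
            # ConnectionResetError and BrokenPipeError are both ConnectionError subclasses
            return type(err).__name__
    return "completed"


def demo():
    """Run the hypothetical server and client once on a free local port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)  # do not wait forever if the client never connects
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=reject_early, args=(listener,))
    server.start()
    try:
        return upload(port)
    finally:
        server.join()
        listener.close()


if __name__ == "__main__":
    print(demo())
```

If this is what happens, the error response is written by the server but the client never gets to read it, because its own send fails first.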

I am in fact using finch (https://github.com/bird-house/finch) as the PyWPS service. The simple run_simple server is actually taken from emu (https://github.com/bird-house/emu/). I am using birdy and owslib on the client side.

Environment

Steps to Reproduce

This is the simplest way I can think of to reproduce the bug, but it involves emu and birdy. Sorry, I am still a noob at this WPS thing.

Server. With emu installed, in a terminal:

emu start --maxsingleinputsize 1mb

Client. With birdy installed, in a python session:

from birdy import WPSClient

wps = WPSClient('http://127.0.0.1:5000')
wps.inout(dataset="path/to/a/netcdf/file/larger/than/a/few/MBs")
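Until the server-side behaviour is sorted out, a client could check the file size before sending it. The sketch below is only a workaround idea, not part of pywps, emu or birdy: parse_size and fits_limit are hypothetical helpers, and the unit table assumes that "1mb" means 1024 * 1024 bytes.

```python
import os
import re

# Assumed binary units: 1kb = 1024 bytes, 1mb = 1024 * 1024 bytes, and so on.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def parse_size(text):
    """Convert a size string such as '1mb' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-z]+)\s*", text.lower())
    if match is None or match.group(2) not in _UNITS:
        raise ValueError(f"Unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit])


def fits_limit(path, limit):
    """Return True if the file at path is no larger than the given limit."""
    return os.path.getsize(path) <= parse_size(limit)
```

With the server started as above, calling fits_limit("path/to/file.nc", "1mb") before wps.inout(...) would let the client fail early with a clear message instead of a connection reset. The path here is a placeholder.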

Additional Information

(*) @tlvu Can you give basic details on how your dev server is different from me calling gunicorn --bind=0.0.0.0:5000 finch.wsgi:application in a terminal? EDIT: Added some info relating to that different setup.

tlvu commented 3 years ago

(*) @tlvu Can you give basic details on how your dev server is different from me calling gunicorn --bind=0.0.0.0:5000 finch.wsgi:application in a terminal?

@aulemahal

My dev server is deployed using the full docker-compose PAVICS stack (https://github.com/bird-house/birdhouse-deploy/tree/master/birdhouse), so Finch is behind Nginx and Twitcher (an access control proxy).

To hit Finch on my dev server directly (bypass Nginx and Twitcher), use http://lvupavicsmaster.ouranos.ca:8095/wps, example:

$ curl --silent --include "http://lvupavicsmaster.ouranos.ca:8095/wps?service=WPS&version=1.0.0&request=GetCapabilities" | head
HTTP/1.1 200 OK
Server: gunicorn
Date: Thu, 01 Apr 2021 21:04:50 GMT
Connection: close
Content-Type: text/xml
Content-Length: 126875

<?xml version="1.0" encoding="UTF-8"?>
<!-- PyWPS 4.4.0 -->
<wps:Capabilities service="WPS" version="1.0.0" xml:lang="en-US" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:wps="http://www.opengis.net/wps/1.0.0" xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wps/1.0.0 ../wpsGetCapabilities_response.xsd" updateSequence="1">

Let me know if you still see a difference in gunicorn behavior between my dev server and the one you launch locally on your machine.

Otherwise, it might come down to a version difference (version of gunicorn, versions of the various libraries installed) or a configuration difference (pywps config, wsgi config, gunicorn config).

aulemahal commented 3 years ago

Thanks @tlvu. I edited the top comment with the new information. Summary: bypassing nginx and twitcher gives the same "ConnectionResetError" on the client side. So with nginx+twitcher it seems like the broken connection results in a nicer "BrokenPipeError" on the server side.