cherrypy / cherrypy

CherryPy is a pythonic, object-oriented HTTP framework. https://cherrypy.dev
https://docs.cherrypy.dev
BSD 3-Clause "New" or "Revised" License
1.8k stars 357 forks source link

cherrypy.tools.static loses with a TypeError in httputil.get_ranges() when asked to serve a range from a fileobj representing a zipfile entry #2024

Open tarunik opened 2 months ago

tarunik commented 2 months ago

I'm submitting a ...

Do you want to request a feature or report a bug?

This is very much a bug -- forcing zipapps to unzip themselves to get at embedded static resources simply isn't possible in some deployment scenarios. (i.e. there's nowhere the user the app is running as can write to, for reasonably good security reasons). Also, what's the point of cherrypy.tools.static.serve_fileobj if what you serve has to be on the filesystem to begin with?

What is the current behavior?

A 500 with a backtrace from httputil.get_ranges() raising a "TypeError: '>=' not supported between instances of 'int' and 'NoneType'"

If the current behavior is a bug, please provide the steps to reproduce and if possible a screenshots and logs of the problem. If you can, show us your code.

Reproduction instructions (with a mildly redacted traceback)

$ python3 pieface.zip & # using the attached pieface.zip
$ curl -r 0-10 http://localhost:8080
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"></meta>
    <title>500 Internal Server Error</title>
    <style type="text/css">
    #powered_by {
        margin-top: 20px;
        border-top: 2px solid black;
        font-style: italic;
    }

    #traceback {
        color: red;
    }
    </style>
</head>
    <body>
        <h2>500 Internal Server Error</h2>
        <p>The server encountered an unexpected condition which prevented it from fulfilling the request.</p>
        <pre id="traceback">Traceback (most recent call last):
  File "~/wsgizip/pieface.pyz/cherrypy/_cprequest.py", line 638, in respond
    self._do_respond(path_info)
  File "~/wsgizip/pieface.pyz/cherrypy/_cprequest.py", line 697, in _do_respond
    response.body = self.handler()
                    ^^^^^^^^^^^^^^
  File "~/wsgizip/pieface.pyz/cherrypy/lib/encoding.py", line 223, in __call__
    self.body = self.oldhandler(*args, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/wsgizip/pieface.pyz/cherrypy/_cpdispatch.py", line 54, in __call__
    return self.callable(*self.args, **self.kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/wsgizip/pieface.pyz/pieface/app.py", line 10, in index
    return serve_fileobj(indexfile)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/wsgizip/pieface.pyz/cherrypy/lib/static.py", line 183, in serve_fileobj
    return _serve_fileobj(fileobj, content_type, content_length, debug=debug)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/wsgizip/pieface.pyz/cherrypy/lib/static.py", line 197, in _serve_fileobj
    r = httputil.get_ranges(request.headers.get('Range'), content_length)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/wsgizip/pieface.pyz/cherrypy/lib/httputil.py", line 93, in get_ranges
    if start &gt;= content_length:
       ^^^^^^^^^^^^^^^^^^^^^^^
TypeError: '&gt;=' not supported between instances of 'int' and 'NoneType'
</pre>
    <div id="powered_by">
      <span>
        Powered by <a href="http://www.cherrypy.dev">CherryPy 18.9.0</a>
      </span>
    </div>
    </body>
</html>

demonstration zipapp, created using cp -R of the sources shown here then a pip install of cherrypy into the builddir followed by invoking zipapp from the command line pieface.zip

source code:

$ cat pieface/pieface/app.py 
import cherrypy
from cherrypy.lib.static import serve_fileobj
from importlib.resources import files

indexfile = files(__spec__.parent).joinpath("index.html").open("r")

class Plop:
    @cherrypy.expose
    def index(self):
        return serve_fileobj(indexfile)

def main():
    cherrypy.quickstart(Plop(), '/')

if __name__ == "__main__":
    main()

and for good measure:

$ cat pieface/pieface/__init__.py 
from .app import main

index.html I am serving:

$ cat pieface/pieface/index.html
<html>
<head>
<title>Plop!</title>
</head>
<body>
This static content goes plop.  And we managed to avoid a pie to the face while we were at it.
</body>
</html>

What is the expected behavior?

200 OK and I get the index.html inside the attached pieface.pyz back (i.e. it ignores the Range header on something that's not a real filesystem object), or 206 Partial Content and I get the first ten bytes of that index.html back (i.e. it's able to figure out the size in a way that isn't reliant on the file-like having a fileno that can be fed to fstat())

What is the motivation / use case for changing the behavior?

This would allow CherryPy's static content handler to be used to serve static files out of a zipapp, which would lend itself to implementing a WAR-like deployment story for Python webapps (unfortunately, mod_wsgi's archaic method of loading scripts is also a barrier to this as it can't load zip-anythings).

Please tell us about your environment:

Other information (e.g. detailed explanation, stacktraces, related issues, suggestions how to fix, links for us to have context, e.g. stackoverflow, gitter, etc.)

httputil.get_ranges() probably should return None instead of barfing if passed None for a content_length

webknjaz commented 1 month ago

That's an interesting idea!

Also, what's the point of cherrypy.tools.static.serve_fileobj if what you serve has to be on the filesystem to begin with?

Looks like people anticipated problems in the past: https://github.com/cherrypy/cherrypy/commit/8df37a4222595e147a43d4e7664da423bdb4fced.

  • Cheroot version: 10.x (shipped with CherryPy 18.9.0)

Sounds like you were using 10.0.0 at the time of posting. It's not bundled but is pulled in as a dependency. You really have to answer the questions accurately to reduce the burden of guessing. The last one is 10.0.1 as of a few days ago.

Anyway.. If you could contribute a pull request with a regression test for this case, it'd be helpful. Contributing a fix would be welcome as well.