bottlepy / bottle

bottle.py is a fast and simple micro-framework for python web-applications.
http://bottlepy.org/
MIT License

UTF8 path string invalid when using app.mount() #602

Open onny opened 10 years ago

onny commented 10 years ago

test.py:

#!/usr/bin/python
import bottle
import testapp

bottle.debug(True)
app = bottle.Bottle()

app.mount('/test',testapp.app)

app.run(reloader=True, host='0.0.0.0', port=8080)

testapp.py:

import bottle

app = bottle.Bottle()

@app.route("/:category", method=["GET","POST"])
def admin(category):
    try:
        return category
    except Exception as e:
        print("e: " + str(e))

Trying to access http://127.0.0.1:8080/test/äöü results in the following error:

Error: 400 Bad Request
Invalid path string. Expected UTF-8

Running Python 3.4.0 with python-bottle 0.12.5.

onny commented 10 years ago

Note that UTF-8 paths generally work when no app is mounted.

test_working.py:

#!/usr/bin/python
# -*- coding: utf-8 -*-

import bottle
import testapp

bottle.debug(True)
app = bottle.Bottle()

@app.route("/test/:category", method=["GET","POST"])
def admin(category):
    try:
        return category
    except Exception as e:
        print("e: " + str(e))

app.run(reloader=True, host='0.0.0.0', port=8080)

Visiting http://127.0.0.1:8080/test/äöü prints the special chars without any problems :/

onny commented 10 years ago

app.mount() accepts special characters if I comment out the line that returns this error:

    def _handle(self, environ):
        path = environ['bottle.raw_path'] = environ['PATH_INFO']

        if py3k:
            try:
                environ['PATH_INFO'] = path.encode('latin1').decode('utf8')
            except UnicodeError:
                print("unicode error")
                # return HTTPError(400, 'Invalid path string. Expected UTF-8')

defnull commented 6 years ago

The actual bug (or bad design decision) seems to be that bottle overwrites environ['PATH_INFO'] in Bottle._handle() with a re-encoded value, which breaks the WSGI spec for mounted WSGI apps, including bottle itself. The mounted app will try to re-encode the already re-encoded string again, assuming it came from a valid WSGI environment.
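The double re-encoding can be reproduced in plain Python without Bottle. This is only a sketch: the helper name `bottle_style_decode` is made up for illustration and mirrors nothing more than the `path.encode('latin1').decode('utf8')` step quoted earlier in this thread.

```python
def bottle_style_decode(path_info):
    """Mirror of the re-encoding step: latin1 string -> bytes -> UTF-8 text."""
    return path_info.encode("latin1").decode("utf8")


# Under Python 3, a WSGI server passes the raw UTF-8 bytes decoded as latin1.
wsgi_native = "/äöü".encode("utf8").decode("latin1")

# First pass (outer app): succeeds and yields the real text.
first_pass = bottle_style_decode(wsgi_native)

# Second pass (mounted app receives the already-decoded value): fails.
try:
    bottle_style_decode(first_pass)
    second_pass_error = None
except UnicodeError as exc:
    second_pass_error = exc

print(repr(first_pass))
print(type(second_pass_error).__name__)
```

For characters outside latin1 (Japanese, for example) the second pass would already fail in `encode("latin1")`; either way the exception is a `UnicodeError`, which is what triggers the 400 response.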

Bottle should not change environ['PATH_INFO'] at all, but instead re-encode the path in Request.path() on demand, and use that for request matching. I'm not sure if this change might break existing applications, but this may be a hard enough bug (breaking WSGI spec) that a backwards incompatible fix would be justifiable.
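A rough sketch of what decoding on demand could look like. This is not Bottle's implementation; `decoded_path` is a hypothetical helper, and the environ dict here is a bare stand-in for a real WSGI environment.

```python
def decoded_path(environ):
    """Return the request path as text without modifying the environ.

    Hypothetical helper: environ['PATH_INFO'] stays in WSGI-native form
    (UTF-8 bytes decoded as latin1), so a mounted app can decode it again.
    """
    raw = environ.get("PATH_INFO", "")
    return raw.encode("latin1").decode("utf8")


# Stand-in for what a WSGI server would provide for the path /äöü.
native = "/äöü".encode("utf8").decode("latin1")
environ = {"PATH_INFO": native}

outer_view = decoded_path(environ)   # outer app decodes for route matching
inner_view = decoded_path(environ)   # mounted app decodes the same value again

print(outer_view, inner_view, environ["PATH_INFO"] == native)
```

Because the environ is never overwritten, every app in the chain sees the same spec-conforming value and decoding any number of times gives the same result.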

sharpaper commented 5 years ago

This bug is 5 years old and still exists. It causes problems with UTF-8 URLs because Bottle returns a 400 error. Can somebody knowledgeable about Bottle please look into fixing this? Thank you.

sharpaper commented 5 years ago

To reproduce:

import bottle
from bottle import get

@get("/<name>")
def index(name):
    return name

Go to http://localhost/%E8

Result:

Error: 400 Bad Request
Sorry, the requested URL 'http://localhost/%C3%A8' caused an error:
Invalid path string. Expected UTF-8

sharpaper commented 5 years ago

This issue seems related to this other one: https://github.com/bottlepy/bottle/issues/792 It looks like both are fixed in the new version 0.13-dev. Would it be possible to merge these Unicode fixes into the next release, instead of waiting for the whole 0.13 release? Unicode mishandling is a major bug and a blocker. Thanks.

defnull commented 5 years ago

Since you are not mounting or redirecting, your issue is not related to this bug or #792. Please open a new issue. Also #792 seems to be fixed by now.

defnull commented 5 years ago

Also, your exact example works fine for valid utf-8 strings, encoded or not. %E8 is not a valid UTF-8 string, so the error message is quite accurate.
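The difference between the two percent-encodings can be checked with the standard library alone. A small sketch; `decode_percent_path` is an illustrative helper, not part of Bottle.

```python
from urllib.parse import unquote_to_bytes


def decode_percent_path(quoted):
    """Percent-decode to raw bytes, then interpret those bytes as UTF-8."""
    return unquote_to_bytes(quoted).decode("utf8")


# Bytes C3 A8 are the UTF-8 encoding of 'è'.
valid = decode_percent_path("/%C3%A8")

# A lone byte E8 is 'è' in latin1, but an incomplete sequence in UTF-8.
try:
    decode_percent_path("/%E8")
    invalid_error = None
except UnicodeDecodeError as exc:
    invalid_error = exc

print(repr(valid))
print(type(invalid_error).__name__)
```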

defnull commented 5 years ago

The original issue (mounting apps) is still present, though. Pull requests are welcomed.

nobrin commented 4 years ago

I ran into this bug in my app too. I work around it as follows, and the app works fine under Python 3.6.8 with Bottle 0.12.18.

#!/usr/bin/env python3
import functools
import bottle

def pathinfo_adjust_wrapper(func):
    # Wrap a mounted app's _handle() so it receives PATH_INFO in WSGI-native form.
    @functools.wraps(func)
    def _(environ):
        # Undo the parent's latin1 -> utf8 re-decoding before _handle() repeats it.
        environ["PATH_INFO"] = environ["PATH_INFO"].encode("utf8").decode("latin1")
        return func(environ)
    return _

api = bottle.Bottle()
api._handle = pathinfo_adjust_wrapper(api._handle)

@api.route("/<path:path>")
def callback(path):
    return {"name": path}

application = bottle.default_app()
application.mount("/api/", api)

application.run(host="0.0.0.0")

Access the URL "/api/日本語/filename" (the middle segment is Japanese).

$ curl 127.0.0.1:8080/api/%E6%97%A5%E6%9C%AC%E8%AA%9E/filename
{"name": "\u65e5\u672c\u8a9e/filename"}

It seems to work. This approach does not need any code change in bottle.py.

In this code, pathinfo_adjust_wrapper() re-encodes and re-decodes PATH_INFO as the inverse of what the original _handle() does, so the second pass through _handle() no longer raises the UnicodeError.
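The inverse relationship can be verified outside Bottle. A minimal sketch with illustrative helper names (`to_wsgi_native` and `handle_style_decode` are not Bottle functions), using the same path as the curl example above:

```python
from urllib.parse import unquote_to_bytes


def to_wsgi_native(text_path):
    """Step performed by the wrapper: text -> UTF-8 bytes -> latin1 string."""
    return text_path.encode("utf8").decode("latin1")


def handle_style_decode(native_path):
    """Step performed by _handle(): latin1 string -> bytes -> UTF-8 text."""
    return native_path.encode("latin1").decode("utf8")


path = "/日本語/filename"
restored = to_wsgi_native(path)

# What a WSGI server would have produced from the percent-encoded request.
from_request = unquote_to_bytes(
    "/%E6%97%A5%E6%9C%AC%E8%AA%9E/filename"
).decode("latin1")

print(restored == from_request)
print(handle_style_decode(restored) == path)
```

Since latin1 maps every byte to exactly one character, the round trip loses nothing, which is why the mounted app can decode the restored value as if it came straight from the server.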

I love :sparkling_heart: the bottle.py framework for developing web apps. Thank you!