tornadoweb / tornado

Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed.
http://www.tornadoweb.org/
Apache License 2.0

add unicode URLSpec support #89

Open ghost opened 14 years ago

ghost commented 14 years ago

Please add Unicode URLSpec support.

Patch:

diff --git a/tornado/web.py b/tornado/web.py
index e51948a..cd9a6a4 100644
--- a/tornado/web.py
+++ b/tornado/web.py
@@ -1368,7 +1368,7 @@ class URLSpec(object):
         """
         if not pattern.endswith('$'):
             pattern += '$'
-        self.regex = re.compile(pattern)
+        self.regex = re.compile(pattern, re.UNICODE)
         self.handler_class = handler_class
         self.kwargs = kwargs
         self.name = name

bdarnell commented 14 years ago

Does that work? I think there's more work to be done to support non-ASCII paths than just redefining the regex character classes. For example, it is my understanding that browsers may send non-ASCII paths as percent-encoded UTF-8, and Tornado does not currently make any attempt to decode them.
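A minimal sketch of the concern above, using only the standard library rather than any Tornado API. The route pattern and the path are made up for illustration, and the example assumes Python 3, where `\w` is already Unicode-aware for `str` patterns (the explicit `re.UNICODE` flag mirrors the patch). It suggests that the flag alone does not help if the path arrives percent-encoded and is matched without decoding:

```python
import re
from urllib.parse import unquote

# Hypothetical route pattern, compiled the way the patch proposes.
pattern = re.compile(r"/tag/(\w+)$", re.UNICODE)

# What a browser may put on the wire for a path ending in "caf" + U+00E9.
raw_path = "/tag/caf%C3%A9"

# Percent-decode, interpreting the bytes as UTF-8 (the default for unquote).
decoded_path = unquote(raw_path)

# "%" is not a word character, so the raw path does not match at all.
raw_match = pattern.match(raw_path)

# After decoding, \w+ covers the accented character and the route matches.
decoded_match = pattern.match(decoded_path)
```

Note that a blanket `unquote` also decodes escapes such as `%2F`, which can change how a path is split into segments, so it is probably not a complete answer on its own.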

ghost commented 14 years ago

I'm not sure it works well.

I have stopped using plain names (Unicode strings) as the key/argument in URLs:

/obj_type/name

I now use a UUID instead:

/obj_type/uuid

bdarnell commented 10 years ago

From recent discussion on the mailing list: the rules for decoding these URLs correctly are complicated and are given in RFC 3987: http://tools.ietf.org/html/rfc3987#section-3.2. We need an implementation of this algorithm and a way for applications to opt in to it (since it would be slow, and it is common practice even for non-English sites to restrict the routing-critical part of their URL paths to ASCII).
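A rough sketch of what such an opt-in decoder might look like, loosely following the URI-to-IRI conversion steps in RFC 3987 section 3.2 as I read them. It is deliberately incomplete: it skips the step that re-escapes characters unsuitable for IRIs (for example bidi formatting characters). The names `uri_path_to_iri`, `match_path`, `decode_iri`, and the error-handler name are hypothetical and are not part of Tornado:

```python
import codecs
import re
import string

# RFC 3986 "unreserved" characters: the only ASCII escapes this sketch decodes.
# Escapes for "%", reserved characters, and other ASCII stay percent-encoded.
_UNRESERVED = frozenset(
    (string.ascii_letters + string.digits + "-._~").encode("ascii")
)
_ESCAPE = re.compile(rb"%([0-9A-Fa-f]{2})")


def _reescape_invalid(exc):
    # Error handler: turn bytes that are not valid UTF-8 back into %XX escapes.
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return "".join("%{:02X}".format(b) for b in bad), exc.end


# Hypothetical handler name, registered once at import time.
codecs.register_error("iri_sketch_reescape", _reescape_invalid)


def uri_path_to_iri(path):
    """Decode percent-encoded UTF-8 in an ASCII path (simplified sketch)."""
    octets = path.encode("ascii")  # raises UnicodeEncodeError on non-ASCII input

    def unescape(match):
        byte = int(match.group(1), 16)
        if byte >= 0x80 or byte in _UNRESERVED:
            return bytes([byte])
        return match.group(0)  # keep "%", reserved, and other ASCII escaped

    octets = _ESCAPE.sub(unescape, octets)
    # Valid UTF-8 sequences become characters; anything else is re-escaped.
    return octets.decode("utf-8", errors="iri_sketch_reescape")


def match_path(pattern, path, decode_iri=False):
    """Hypothetical opt-in: decode only when the application asks for it."""
    if decode_iri:
        path = uri_path_to_iri(path)
    return pattern.match(path)
```

With `decode_iri` left at its default, matching behaves as before, so ASCII-only sites would not pay for the extra pass. An application that wants Unicode routes could call, say, `match_path(re.compile(r"/tag/(\w+)$"), "/tag/caf%C3%A9", decode_iri=True)`.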