tornadoweb / tornado

Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed.
http://www.tornadoweb.org/
Apache License 2.0

HTTPHeaders.get_list() improve functionality #1094

Open liamcoau opened 10 years ago

liamcoau commented 10 years ago

Currently, if the request's Accept header has the value text/*, text/html, text/html;level=1, */*, then self.request.headers.get_list("Accept") returns: ["text/*, text/html, text/html;level=1, */*"]

The documentation for get_list says: "Returns all values for the given header as a list."

When I originally read this I thought the multiple accept types would be broken up and all added to a list such as: ["text/*", "text/html", "text/html;level=1", "*/*"]

I think that this would be much more useful and make more sense, or at least it would be in the project I'm working on.

I achieved the same effect by the following code:

accept_lines = self.request.headers.get_list("Accept")
response_type = [item.strip() for line in accept_lines for item in line.split(",")]

Also, it would be helpful if the documentation were explicit about what is returned when the named header is not present.

bdarnell commented 10 years ago

That's a good point. The reason for the current behavior is that I've always looked at this from the opposite perspective: a few headers (notably Set-Cookie) erroneously contain embedded commas, so it is not appropriate to take distinct header lines and join them with commas. get_list() was introduced as a way to cleanly get the multiple Set-Cookie headers as they were originally presented.

For headers like Accept, the older way of looking up the joined singular value and splitting that still works (so your last example could be response_type = [i.strip() for i in self.request.headers.get("Accept").split(',')]), whether it was sent as a single header line or multiple.

I'm not sure if we should treat this as a documentation issue and make it clear that get_list is designed for cases where split doesn't work, or if we should introduce some way to get the headers split by the framework.
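As a rough illustration of the "look up the joined value and split it" approach described above, here is a minimal sketch. The helper name split_simple_list_header is hypothetical and not part of Tornado. It takes the string that headers.get("Accept") would return, including None when the header is absent, and it assumes the value contains no quoted strings with embedded commas.

```python
from typing import List, Optional


def split_simple_list_header(raw: Optional[str]) -> List[str]:
    """Split a joined header value on commas (hypothetical helper).

    Assumes the value has no quoted strings containing commas, so it is
    not suitable for headers such as Set-Cookie.
    """
    if raw is None:
        # A mapping-style .get() returns None when the header is missing.
        return []
    # Strip whitespace around each element and drop empty elements.
    return [part.strip() for part in raw.split(",") if part.strip()]


if __name__ == "__main__":
    print(split_simple_list_header("text/*, text/html, text/html;level=1, */*"))
    print(split_simple_list_header(None))
```

In a handler this might be called as split_simple_list_header(self.request.headers.get("Accept")), which avoids an AttributeError when the client sends no Accept header.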

spaceone commented 3 years ago

> so your last example could be response_type = [i.strip() for i in self.request.headers.get("Accept").split(',')])

That's wrong. A header value can contain a quoted comma.

If you are looking for a correct way to split any header, use https://github.com/spaceone/httoop:

>>> import httoop
>>> h = httoop.Headers()
>>> h['Accept'] = 'text/*; q=0.3, text/html;q=0.9, text/html;level=1, */*'
>>> h.elements('Accept')
[<Accept('text/html', {b'level': '1'})>, <Accept('*/*')>, <Accept('text/html', {b'q': b'0.9'})>, <Accept('text/*', {b'q': b'0.3'})>]
>>> h.parse(b'Set-Cookie: foo=bar;\r\nSet-Cookie: bar=baz; max-age=1')
>>> h.elements('Set-Cookie')
[<Set-Cookie('foo=bar')>, <Set-Cookie('bar=baz', {b'max-age': '1'})>]
>>> h['Accept'] = 'text/html; foo=","; q=0.3, text/*'
>>> h.elements('Accept')
[<Accept('text/*')>, <Accept('text/html', {b'foo': ',', b'q': b'0.3'})>]
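For readers who only need the splitting step and not a full header parser, the quoted-comma case above can be handled with a small state machine. This is a minimal sketch using only the standard library. The function name split_header_list is hypothetical, and it only separates list elements: it does not parse parameters or sort by q value the way the httoop output above does, and it does not address Set-Cookie's unquoted commas.

```python
from typing import List


def split_header_list(value: str) -> List[str]:
    """Split a header value on commas that are outside double quotes.

    Hypothetical helper: handles quoted strings and backslash escapes
    inside them, but does no further parsing of each element.
    """
    items: List[str] = []
    current: List[str] = []
    in_quotes = False
    escaped = False
    for ch in value:
        if escaped:
            # Character following a backslash inside quotes is literal.
            current.append(ch)
            escaped = False
        elif ch == "\\" and in_quotes:
            current.append(ch)
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == "," and not in_quotes:
            # An unquoted comma ends the current element.
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    items.append("".join(current).strip())
    # Drop empty elements such as those produced by ", ,".
    return [item for item in items if item]


if __name__ == "__main__":
    print(split_header_list('text/html; foo=","; q=0.3, text/*'))
```

A naive split(",") on the same input would cut the foo="," parameter in half, which is the failure mode pointed out above.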