niklasb / webkit-server

[not actively maintained] The C++ webkit-server from capybara-webkit with useful extensions and Python bindings
MIT License
48 stars 38 forks source link

Client.headers() now returns a dict, with the key being the response header and the value being a list of response values. #14

Closed Rolandde closed 9 years ago

Rolandde commented 9 years ago

You changed the Client.headers() method today in response to my bug report.

You changed it to return a list of tupples (originally returns a dict of strings). I think it might be better to continue returning a dict, as users generally want to isolate specific headers. I rewrote it so that it returns a dict, with the value being a list. I think it is easier for the user to deal with accessing list values than iterating over the list, testing for the equality of the first element to their desired response header.

niklasb commented 9 years ago

This is on purpose, because a dict does not appropriately model HTTP headers. In particular, according to the HTTP specs, there can be multiple instances of the same header, e.g. Set-Cookie.

In case you prefer the old behavior, just use dict(Client.headers()) instead.

niklasb commented 9 years ago

Oh I just see that your change actually also proposes different semantics that resolve the old issue. I will think about it some more.

Rolandde commented 9 years ago

Using dict(Client.headers()) with the current code, which returns a list of tupples, would re-introduce a bug. It would arbitrarily remove instances of multiple headers. As an example:

l = [('a', 'a1'), ('b', 'b1'), ('a', 'a2')]
d = dict(l)
d
>>> {'a': 'a1', 'b': 'b1'}
niklasb commented 9 years ago

That was just a quick suggestion for when you want the old behaviour back, when there is only one instance of each header (e.g. dict(sess.headers())['content-type']). This covers 99,9% of cases. I will slightly change the semantics so that the keys are normalized to lower-case.

When you want to get your suggested list, you can do [y for x,y in sess.headers().items() if x == 'set-cookie']. That's a bit of boilerplate, but it does not seem to be a use case common enough to optimize for. I think it's better to keep it simple, what do you think?

On Sat, Oct 18, 2014 at 5:08 PM, Rolandde notifications@github.com wrote:

Using dict(Client.headers()) with the current code, which returns a list of tupples, would re-introduce a bug. It would arbitrarily remove instances of multiple headers. As an example:

l = [('a', 'a1'), ('b', 'b1'), ('a', 'a2')]d = dict(l)d>>> {'a': 'a1', 'b': 'b1'}

— Reply to this email directly or view it on GitHub https://github.com/niklasb/webkit-server/pull/14#issuecomment-59617616.

Rolandde commented 9 years ago

Here are my thoughts:

I will slightly change the semantics so that the keys are normalized to lower-case.

Please don't do that. HTML header strings are standardized (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html) and making them lower-case breaks that standard. I always expect the header string to be 'User-Agent', never 'user-agent'.

Your suggestion of using list comprehension will certainly work. I still like the dictionary approach as I feel it is more useful and consistent. Useful in the sense that most times I call headers() is because I want to find a specific header. The value being a list makes that every so slightly more annoying, but it makes it very explicit that multiple headers of the same name could be returned. Both changes will break code from previous versions, but I feel the dictionary approach is easier to re-factor. Knowing that dict(sess.headers())['Content-Type']`) may drop elements is far more obscure and bugs caused by this will be trickier to track down.

All the best.

niklasb commented 9 years ago

Thanks for the fruitful discussion. I'm closing this because I feel like the current version has quite reasonable behavior in this regard. Header names are now normalized to per-segment capitalized form (i.e. User-Agent).