zopefoundation / zope.publisher

Map requests from HTTP/WebDAV clients, web browsers, XML-RPC and FTP clients onto Python objects

Revamp handling of query string and form decoding #66

Closed cjwatson closed 2 years ago

cjwatson commented 3 years ago

The previous approach was to tell underlying libraries to decode inputs using ISO-8859-1, then re-encode as ISO-8859-1 and decode using an encoding deduced from the Accept-Charset request header. However, this didn't make much conceptual sense (since Accept-Charset defines the preferred response encoding), and it made it impossible to handle cases where the encoding was specified as something other than ISO-8859-1 in the request (which might even be on a per-item basis, in the case of multipart/form-data input).
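To make the old round trip concrete, here is a minimal standard-library sketch of what it amounts to. It is not the zope.publisher code itself; the sample value and the `guessed_charset` variable are assumptions standing in for whatever the Accept-Charset guess produced.

```python
# Raw bytes as they arrive on the wire: "café" encoded as UTF-8.
raw = "café".encode("utf-8")

# Step 1: the underlying library is told to decode as ISO-8859-1. This never
# fails, because every byte maps to a code point, but the text is mojibake.
as_latin1 = raw.decode("iso-8859-1")

# Step 2: re-encode as ISO-8859-1 to get the original bytes back, then decode
# with the charset guessed from Accept-Charset.
guessed_charset = "utf-8"  # assumption: result of the Accept-Charset guess
recovered = as_latin1.encode("iso-8859-1").decode(guessed_charset)
```

The round trip is lossless only because ISO-8859-1 maps all 256 byte values. It gives the right text only when the guess happens to match the encoding the client actually used, and it has no way to honour a charset declared in the request itself.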

We now only perform the dubious Accept-Charset guessing for query strings; in other cases we let the multipart library determine the encoding, defaulting to UTF-8 as per the HTML specification. For cases where applications need to specify some other default form encoding, BrowserRequest subclasses can now set default_form_charset.
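As a rough illustration of the per-item behaviour for form data, the sketch below picks the charset declared on a part's Content-Type and falls back to a default otherwise. It uses the standard library's `email.message` for header parsing rather than multipart's own parser, and `part_charset`, `decode_part` and `DEFAULT_FORM_CHARSET` are hypothetical names, with the constant standing in for the role of default_form_charset.

```python
from email.message import Message

# Hypothetical stand-in for the role played by default_form_charset.
DEFAULT_FORM_CHARSET = "utf-8"


def part_charset(content_type_header, default=DEFAULT_FORM_CHARSET):
    """Return the charset declared on a part's Content-Type, else the default."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_charset() lowercases the declared charset and returns the
    # fallback when the header has no charset parameter.
    return msg.get_content_charset(default)


def decode_part(raw, content_type_header, default=DEFAULT_FORM_CHARSET):
    """Decode one part's bytes using its own declared charset if present."""
    return raw.decode(part_charset(content_type_header, default))


# A part that declares its own charset is decoded with it.
latin1_value = decode_part(b"caf\xe9", "text/plain; charset=ISO-8859-1")

# A part with no declared charset falls back to the default (UTF-8 here).
utf8_value = decode_part("café".encode("utf-8"), "text/plain")
```

Under this sketch, an application that needs a different fallback would pass another `default`, which is roughly the effect a BrowserRequest subclass would get by setting default_form_charset.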

Fixes #65.

cjwatson commented 3 years ago

On Python 2 this exposed (via Launchpad's test suite) what I think is a bug in multipart, fixed by https://github.com/defnull/multipart/pull/36.