zopefoundation / zope.app.publisher

Implementations and means for configuration of Zope 3-style views and resources.

Encoding handling in XML-RPC testing infrastructure seems to be broken on Python 3 #11

Closed icemac closed 4 years ago

icemac commented 4 years ago

ZopeTestTransport decodes the body as utf-8 to convert it from bytes to str: https://github.com/zopefoundation/zope.app.publisher/blob/19ee5193e1bd42e9e69dc144b76e11abfb4da773/src/zope/app/publisher/xmlrpc/testing.py#L79-L81

Later on FakeSocket converts the body str back to bytes using latin-1: https://github.com/zopefoundation/zope.app.publisher/blob/19ee5193e1bd42e9e69dc144b76e11abfb4da773/src/zope/app/publisher/xmlrpc/testing.py#L24-L25

This breaks inside the expat parser: characters outside ASCII arrive latin-1 encoded, but the default encoding for XML-RPC seems to be utf-8, so the parser is not able to handle them.
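A minimal sketch of the mismatch described above, using only the standard library rather than the actual ZopeTestTransport and FakeSocket classes. The payload and the helper name `parses` are made up for illustration; the two conversion steps mirror the utf-8 decode and the latin-1 encode.

```python
import xml.parsers.expat

# Hypothetical payload. There is no encoding declaration, so expat assumes UTF-8.
# "\u00fc\u00df" are the non-ASCII characters u-umlaut and sharp s.
xml_text = "<?xml version='1.0'?><value><string>Gr\u00fc\u00dfe</string></value>"
wire_body = xml_text.encode("utf-8")

# Step 1 (as in ZopeTestTransport): bytes -> str using utf-8.
decoded = wire_body.decode("utf-8")
# Step 2 (as in FakeSocket): str -> bytes using latin-1.
reencoded = decoded.encode("latin-1")


def parses(data):
    """Return True if expat accepts the given bytes as a complete document."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


print(parses(wire_body))   # the original utf-8 bytes
print(parses(reencoded))   # the bytes after the utf-8 / latin-1 detour
```

The two byte strings differ only in the non-ASCII characters, which is why pure-ASCII payloads do not show the problem.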

I propose to skip the conversion of the HTTP body and instead feed the raw data to the parser.
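As a rough illustration of that proposal, the sketch below hands the undecoded bytes straight to the standard library's XML-RPC parser. It does not touch the testing module itself. The payload is hypothetical, and passing bytes to `xmlrpc.client.loads` relies on the underlying expat parser accepting bytes, which is an assumption about current CPython behaviour rather than a documented guarantee.

```python
import xmlrpc.client

# Hypothetical response body as it would appear on the wire (utf-8 bytes).
raw_body = xmlrpc.client.dumps(
    ("Gr\u00fc\u00dfe",), methodresponse=True
).encode("utf-8")

# No decode/encode round trip: the parser sees the bytes exactly as sent.
params, method_name = xmlrpc.client.loads(raw_body)
result = params[0]
```

Because the bytes are never converted to str in between, no second codec can disagree with the first one.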

d-maurer commented 4 years ago

In reply to the report above (Michael Howitz wrote at 2020-4-17 00:57 -0700):

A byte (i.e. an integer between 0 and 255) is naturally represented by the Unicode character whose code point is that value.

Based on this representation, it is natural to convert a bytes object to str via bytes.decode("latin-1") and to revert the conversion via str.encode("latin-1").
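The round trip can be checked with a short sketch. The variable names are illustrative only. It also shows why mixing the two codecs, as the current code path does, is not a round trip.

```python
# Every possible byte value, 0 through 255.
all_bytes = bytes(range(256))

# latin-1 maps byte value n to the Unicode code point n ...
as_text = all_bytes.decode("latin-1")
code_points = [ord(ch) for ch in as_text]

# ... so encoding with latin-1 again restores the original bytes exactly.
round_tripped = as_text.encode("latin-1")

# A utf-8 encoded u-umlaut is two bytes on the wire.
utf8_body = "\u00fc".encode("utf-8")

# Same codec in both directions: the bytes survive unchanged.
same_codec = utf8_body.decode("latin-1").encode("latin-1")

# utf-8 in one direction and latin-1 in the other: the bytes change.
mixed_codecs = utf8_body.decode("utf-8").encode("latin-1")
```

With latin-1 on both sides the str is only a transport container for the bytes, so whatever encoding the XML itself uses stays intact.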

Therefore, I think that it is ZopeTestTransport which does the wrong thing, not FakeSocket.