Closed (icemac closed this issue 4 years ago)
Michael Howitz wrote at 2020-4-17 00:57 -0700:
> `ZopeTestTransport` decodes the body with `utf-8` to convert it from `bytes` to `str`: https://github.com/zopefoundation/zope.app.publisher/blob/19ee5193e1bd42e9e69dc144b76e11abfb4da773/src/zope/app/publisher/xmlrpc/testing.py#L79-L81
>
> Later on, `FakeSocket` converts the body `str` back to `bytes` using `latin-1`: https://github.com/zopefoundation/zope.app.publisher/blob/19ee5193e1bd42e9e69dc144b76e11abfb4da773/src/zope/app/publisher/xmlrpc/testing.py#L24-L25
>
> This breaks inside the expat parser, because it cannot handle characters outside ASCII that are `latin-1` encoded: the default encoding for XML-RPC seems to be `utf-8`.
>
> I propose to omit the conversion of the HTTP body and instead feed the raw data to the parser.
A byte (i.e. an integer between 0 and 255) is naturally represented by the Unicode character that has the byte's value as its code point.
Based on this representation, it is natural to convert a `bytes` object to `str` via `bytes.decode("latin1")` and to revert the conversion via `str.encode("latin-1")`.
Therefore, I think that it is `ZopeTestTransport` which does the wrong thing, not `FakeSocket`.
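
The byte/code point correspondence can be checked directly in Python. This is only an illustrative sketch of the argument; the sample body is hypothetical and nothing here is taken from `testing.py`.

```python
# Every possible byte value, 0 through 255.
all_bytes = bytes(range(256))

# latin-1 maps byte value n to the character with code point n ...
as_text = all_bytes.decode("latin1")
assert [ord(ch) for ch in as_text] == list(range(256))

# ... so encoding with latin-1 restores the original bytes exactly.
assert as_text.encode("latin-1") == all_bytes

# Consequence for a UTF-8 encoded body (hypothetical sample): carrying it
# through str via latin-1 leaves the UTF-8 bytes untouched.
utf8_body = "Grüße".encode("utf-8")
carried = utf8_body.decode("latin1")
assert carried.encode("latin-1") == utf8_body
```

Under this reading, a `str` produced with `latin1` is just a lossless container for the raw bytes, which is why the later `latin-1` encode would be harmless if the earlier decode had used the same codec.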