Closed: user185953 closed 5 years ago
WBXML fields are binary in general. PyWBXMLDecoder corrupts bytes that have the highest bit set. This patch changes the mode of corruption: valid UTF-8 is preserved and non-UTF-8 bytes are escaped.
This looks pretty good to me @davidpshaw. This is basically how we do it in Boxer.
Thank you for the review, @iragsdale. The other commit I just pushed will be trickier. Bytes().hex() looks odd, CDATA with hex-encoded data is probably not right, and the output format changes. The general idea, however, looks OK?
@user185953 still need to put up a PR into mitmproxy -- this project isn't brought in as a submodule there, it's duplicated.
Based on https://github.com/cisiqo/PyWBXML/commit/770ad4cfd18d631f889d730d8b4c4903a6fb026e. The "errors='backslashreplace'" bit is there because in mitmproxy the priority is seeing every byte.
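For anyone reading along, here is a minimal sketch of what errors='backslashreplace' does when decoding, using only the Python 3 standard library. The helper name decode_preserving_bytes is hypothetical and is not taken from the patch; the actual decoder code may be structured differently.

```python
def decode_preserving_bytes(raw: bytes) -> str:
    """Decode as UTF-8; bytes that are not valid UTF-8 become \\xNN escapes.

    Hypothetical helper for illustration only, not the code from the patch.
    """
    return raw.decode("utf-8", errors="backslashreplace")


# Valid UTF-8 passes through unchanged.
print(decode_preserving_bytes("héllo".encode("utf-8")))

# Invalid bytes stay visible as escapes instead of being dropped or mangled.
print(decode_preserving_bytes(b"abc\xff\xfe"))
```

The trade-off is that the result is not reversible in general: a literal backslash sequence in the input is indistinguishable from an escaped byte in the output. That seems acceptable when the goal is inspection rather than round-tripping.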