Closed mats-swissphone closed 1 year ago
Thanks for the report! Compatibility with other implementations is important to me, so I'll try fix this ASAP and publish a new release.
I haven't ever used binary / base64 data with Python's xmlrpc module, so to make sure I correctly understand the problem: When encoding binary data as base64, the Python implementation inserts newlines after multiples of 76 characters - but no other whitespace, or carriage return characters? The RFC linked from the Python docs indicates that CRLF should used ... so, does the Python implementation not even follow the rules from the RFC it's linking to? :)
In this case, I agree, normalizing the input string by stripping newline characters (\n
, and maybe \c
for good measure) before attempting to decode would be a good idea.
Thanks for the quick response!
You are probably referring to the following in the RFC:
In particular, text line breaks must be converted into CRLF sequences prior to base64 encoding.
I believe this applies to encoding textual data as base64, because it says prior to encoding. So this does not apply in our case here.
The following paragraph however is important:
The encoded output stream must be represented in lines of no more than 76 characters each. All line breaks or other characters not found in Table 1 must be ignored by decoding software. In base64 data, characters other than those in Table 1, line breaks, and other white space probably indicate a transmission error, about which a warning message or even a message rejection might be appropriate under some circumstances.
As I interpret this, the decoder should therefore:
\n
and/or \r
) and other whitespace (ASCII only, or even Unicode?),Python will indeed insert newlines \n
every 76 output bytes: source, detailed docs.
So ignoring just \n
would work for the Python client, but the RFC suggests to ignore more.
The decision is yours :)
Ok, that sounds like dropping \n
, \r
, and ASCII whitespace (i.e. ` and
\t`) would be a good middle ground between "not handling line breaks at all" and "stripping everything that's not in the base64 alphabet" (since the latter would likely cause invalid input to become valid). I'll look into implementing that tomorrow.
I pushed a fix for this and released v0.6.1: https://crates.io/crates/dxr/0.6.1
@decathorpe
Thanks! Works.
Hi. Great library! We'd like to use it for our internal projects.
We have stumbled upon an error when sending binary (bytes/base64) values via Python xmlrpc.client, which encodes base64 with newlines.
The following quickfix helped in our case:
A proper upstream fix would be much appreciated!
Hope this helps & Cheers
(Edit: fixed the link)