apache / openwhisk-package-kafka

Apache OpenWhisk package for communicating with Kafka or Message Hub
https://openwhisk.apache.org/
Apache License 2.0

Fix binary encoding #270

Closed maximann closed 4 years ago

maximann commented 6 years ago

This addresses #269. Note: I'm not a Python coder, so I'm sure there are better ways to achieve this.

When binary data (a byte string) is encoded as UTF-8, which is a variable-length encoding, only the low 7 bits of each byte are preserved as-is. The 8th bit has a special meaning: it marks bytes that belong to a multi-byte sequence. This will of course corrupt any true binary data that contains values larger than 127.
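To make the corruption concrete, here is a minimal Python 3 sketch. It is not the provider's code, only an illustration with a made-up four-byte payload: bytes up to 0x7F survive a UTF-8 round trip, anything above does not, while Base64 restores every byte value exactly.

```python
import base64

# Hypothetical payload: two bytes below 128 and two bytes above 127.
raw = bytes([0x00, 0x7F, 0x80, 0xFF])

# Strict UTF-8 decoding rejects the payload, because 0x80 and 0xFF
# are not valid standalone UTF-8 bytes.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Treating each byte as a character and then encoding as UTF-8 turns
# every value above 127 into a two-byte sequence, changing the data.
reencoded = raw.decode("latin-1").encode("utf-8")

# A lossy decode replaces invalid bytes with U+FFFD, which is not reversible.
lossy = raw.decode("utf-8", errors="replace").encode("utf-8")

# Base64 maps all 256 byte values to ASCII and round-trips exactly.
encoded = base64.b64encode(raw).decode("ascii")
restored = base64.b64decode(encoded)

print(strict_ok, reencoded, lossy != raw, restored == raw)
```

The point of the comparison is that any text encoding step applied to the raw bytes before Base64 can change them, so the bytes should go to the Base64 encoder untouched.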

A secondary issue addressed in this PR relates to the Base64 encoding mechanism. The method called previously inserts a newline character into the encoded string after every 76 characters, which newer encoding libraries typically do not do and most consumers do not expect. I've added a new flag that allows encoding without newline characters (which I assume is what most people will expect).
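The newline behavior can be reproduced with Python's standard library. This sketch assumes the method referred to behaves like `base64.encodebytes` (named `encodestring` in older Python versions), which wraps output MIME-style; the comment above does not name the method, so treat that as an assumption. `base64.b64encode` produces a single unwrapped line.

```python
import base64

# Hypothetical payload covering every byte value.
payload = bytes(range(256))

# MIME-style encoding: a newline follows every 76 output characters.
wrapped = base64.encodebytes(payload)

# Plain encoding: one line, no newlines.
unwrapped = base64.b64encode(payload)

# Apart from the line breaks, the two outputs are identical.
same_without_newlines = wrapped.replace(b"\n", b"") == unwrapped

print(wrapped.count(b"\n"), unwrapped.count(b"\n"), same_without_newlines)
```

Python's own decoder ignores the embedded newlines by default, but stricter decoders in other languages may reject them, which is why the unwrapped form is the safer default for consumers.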

Finally, the current implementation encodes each message twice: once to determine its size and a second time to actually trigger the function.
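One way to avoid the double encoding is to encode once and reuse the result for both the size check and the payload. The sketch below is illustrative only: `encode_once`, `messages`, and `batch` are hypothetical names, not identifiers from the provider code.

```python
import base64


def encode_once(value: bytes):
    """Return (encoded_text, size), encoding the value a single time."""
    encoded = base64.b64encode(value).decode("ascii")
    return encoded, len(encoded)


# Hypothetical stand-ins for message values read from Kafka.
messages = [b"\x00\x01", b"\xff" * 10]

batch = []
total_size = 0
for value in messages:
    encoded, size = encode_once(value)
    batch.append(encoded)   # reused later as the trigger payload
    total_size += size      # reused for the payload size accounting

print(batch, total_size)
```

Keeping the encoded string alongside its length means the size accounting and the payload can never disagree, which is a risk when the two are computed by separate encoding calls.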

This fix has the potential to break existing functions that may rely on the old behavior. I'm not sure how to address that and would appreciate feedback. A new parameter to trigger the fixed binary encoding may be necessary.

dubee commented 6 years ago

@abaruni, FYI.

dgrove-oss commented 4 years ago

Closing as abandoned.