Closed yusefnapora closed 8 years ago
So the issue is that we end up with CBOR major type 2 (byte string) instead of 3 (text string)? As far as I can tell, the only difference is that type 3 is defined to be UTF-8 whereas type 2 carries no encoding, so we can fall back to decoding type 2 as unicode as well, right?
@yusefnapora can you confirm and I'll merge?
ah right. yeah, that is the only difference. I updated the go client to try to decode byte strings as utf-8 if possible, but it would be good to get this merged in
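For reference, a minimal Python 3 sketch of the difference discussed above and of the fallback the go client is described as using. It is not the project's code: the function names `encode_short_string` and `decode_short_string` are made up for illustration, and the sketch assumes payloads shorter than 24 bytes so the length fits in the initial byte.

```python
def encode_short_string(value):
    """Encode a str or bytes value (under 24 bytes) as a single CBOR item."""
    if isinstance(value, str):
        major, payload = 3, value.encode("utf-8")  # major type 3: text string
    else:
        major, payload = 2, bytes(value)           # major type 2: byte string
    if len(payload) >= 24:
        raise ValueError("this sketch only handles payloads under 24 bytes")
    # The initial byte holds the major type in the high 3 bits
    # and the payload length in the low 5 bits.
    return bytes([(major << 5) | len(payload)]) + payload


def decode_short_string(item):
    """Decode an item from encode_short_string, falling back for type 2."""
    major, length = item[0] >> 5, item[0] & 0x1F
    if length >= 24:
        raise ValueError("this sketch only handles payloads under 24 bytes")
    payload = item[1:1 + length]
    if major == 3:
        return payload.decode("utf-8")
    if major == 2:
        # Fallback: treat the byte string as UTF-8 when it is valid,
        # otherwise hand back the raw bytes unchanged.
        try:
            return payload.decode("utf-8")
        except UnicodeDecodeError:
            return payload
    raise ValueError("not a string item")
```

The same three characters produce different initial bytes (`0x63` for text, `0x43` for bytes), which is why the two look identical when printed as Python strings but not in a reader that distinguishes the types.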
Playing with the Go reader client, I discovered that we've been writing the translator id to the CBOR records as a byte string instead of a unicode string. My fault, since I should have been returning a unicode string from `Translator.versioned_id`. This isn't obvious when printing the records in the Python client, since both get printed as strings. In Go, byte strings are printed as slices with hex digits for each byte. This makes sure we return unicode from that method and, just in case, wraps the output in `unicode()` before writing.
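As a rough illustration of the "wrap before writing" step, here is a Python 3 sketch. The change itself uses the Python 2 built-in `unicode()`; the helper name `ensure_text` and the UTF-8 default are assumptions made for this example, not part of the codebase.

```python
def ensure_text(value, encoding="utf-8"):
    """Return value as text so a CBOR encoder emits a text string for it."""
    if isinstance(value, bytes):
        # Byte strings are decoded explicitly; invalid input raises
        # UnicodeDecodeError rather than being written as raw bytes.
        return value.decode(encoding)
    return str(value)
```

Calling such a helper on the id just before it is handed to the encoder is a belt-and-braces measure: it is a no-op when the method already returns text.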