zalando-zmon / zmon-worker

ZMON Python Worker
https://zmon.io/

added safe check for unicode types and unit testing #408

Closed lerovitch closed 5 years ago

lerovitch commented 5 years ago

Problem

Encodings in Python 2 have always been messy because of the implicit conversions done behind the scenes. Essentially there are two string types: str and unicode. The former is just a sequence of bytes, while the latter holds decoded text (Unicode code points) that has to be encoded before it becomes bytes.

When we have in our code something like:

key = str(key)

Python internally does the following:

If key is a str, the call is a no-op, since bytes are bytes. If key is a unicode object, Python implicitly encodes it with the default codec (ASCII), which is risky: it raises UnicodeEncodeError whenever the text contains non-ASCII characters.
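The failure mode can be reproduced with a small sketch. The helper implicit_str() below is hypothetical and is not part of zmon-worker; it only mimics what Python 2's str() does with a unicode object by encoding explicitly with the ASCII codec, so the snippet behaves the same on Python 2 and Python 3. The key values are made up for illustration.

```python
def implicit_str(value):
    """Hypothetical stand-in for Python 2's str(unicode_value):
    encode the text with the default ASCII codec."""
    return value.encode("ascii")


ascii_key = u"cpu_load"        # illustrative key, ASCII only
non_ascii_key = u"caf\u00e9"   # illustrative key containing a non-ASCII character

# ASCII-only text converts without trouble.
ascii_result = implicit_str(ascii_key)

# Non-ASCII text cannot be represented in ASCII, so the conversion raises.
try:
    implicit_str(non_ascii_key)
    failed = False
except UnicodeEncodeError:
    failed = True
```

The second call is the case that breaks when a key contains anything outside the ASCII range.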

Impact

The function prometheus_flat() uses this flatten utility, so it fails whenever the text contains non-ASCII characters.

Solution

I have tried to perform an explicit conversion only when it is needed.
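One way to express "convert explicitly, and only when needed" is sketched below. This is not the code from this pull request: the helper name to_text() and the choice of UTF-8 as the encoding are assumptions made for illustration. The idea is to leave text untouched, decode bytes with a named codec, and only fall back to a generic conversion for other types.

```python
# type(u"") is unicode on Python 2 and str on Python 3.
text_type = type(u"")


def to_text(key, encoding="utf-8"):
    """Hypothetical helper: return key as text without implicit ASCII conversion.

    The UTF-8 default is an assumption, not something the PR specifies.
    """
    if isinstance(key, text_type):
        # Already text: nothing to convert.
        return key
    if isinstance(key, bytes):
        # Raw bytes (str on Python 2): decode with an explicit codec.
        return key.decode(encoding)
    # Numbers and other objects: build their text representation.
    return text_type(key)


from_text = to_text(u"caf\u00e9")
from_bytes = to_text(b"caf\xc3\xa9")
from_int = to_text(42)
```

Because the codec is named explicitly, non-ASCII keys no longer depend on the interpreter's default encoding.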

vetinari commented 5 years ago

👍

alexkorotkikh commented 5 years ago

👍