When creating an mmdb file that contains non-ASCII characters, the resulting file cannot be read. It looks as if the offsets are wrong.
I think this is because the offsets written to the file assume that the Python string length is the same as the number of bytes produced when the string is encoded to UTF-8.
Computing the length from the encoded string at https://github.com/cloudflare/py-mmdb-encoder/blob/master/mmdbencoder/__init__.py#L346 seems to produce the correct result:
length = len(value.encode('utf-8'))
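To illustrate the mismatch, here is a minimal standalone sketch. It does not use the encoder itself, and `utf8_length` is a hypothetical helper name chosen for this example, not something from py-mmdb-encoder. It only shows that `len()` on a Python string counts code points, while the size that ends up in the file is the number of UTF-8 bytes.

```python
def utf8_length(value: str) -> int:
    """Return the number of bytes `value` occupies once encoded as UTF-8.

    Hypothetical helper for illustration only; not part of py-mmdb-encoder.
    """
    return len(value.encode("utf-8"))


samples = ["Paris", "Zürich", "東京"]

for s in samples:
    # len(s) counts code points; utf8_length(s) counts encoded bytes.
    # The two agree only when every character is ASCII.
    print(f"{s!r}: str length = {len(s)}, utf-8 bytes = {utf8_length(s)}")
```

For ASCII-only strings the two values match, which would explain why the problem only shows up with non-ASCII data.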