vivithemage / mrisa

MRISA - Meta reverse image search api
http://mrisa.mage.me.uk/
GNU General Public License v2.0

Unicode in the json file #10

Closed by jimlynnjulian 6 years ago

jimlynnjulian commented 6 years ago

I found some problems in the JSON files I created from the returned data using a modified version of 'client.py'. Every file begins with the letter 'b' and a single quote, and ends with a single quote. In between are several escaped characters and Unicode escapes. Examples are: \' \\ \u00a0 \u00d7 If the initial b and single quote are removed along with the trailing quote, every '\\' becomes '\', and every escaped apostrophe has its backslash removed, then the file is accepted as JSON. The question is, can these characters be removed easily? I'm only vaguely familiar with Python string types and Unicode, ASCII, and byte conversions. I could write a script to remove the examples above, but are those the only cases that will ever turn up?
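For files that have already been saved this way, the symptoms described above match what Python produces when a bytes object is converted with str(): a leading b', a trailing ', and backslash escapes in between. Assuming that is what happened here (the modified 'client.py' is not shown, so this is a guess), a rough sketch of undoing it without hand-written replacements could look like this. The function name recover_json and the sample payload are made up for illustration.

```python
import ast
import json


def recover_json(text):
    """Parse text of the form b'...' (the repr of a bytes object) as JSON.

    Hypothetical helper: assumes the file holds the repr of UTF-8 bytes.
    """
    # literal_eval only evaluates Python literals, so it reverses every
    # escape that repr() can produce without running arbitrary code.
    raw = ast.literal_eval(text.strip())
    if not isinstance(raw, bytes):
        raise ValueError("expected a bytes literal such as b'...'")
    return json.loads(raw.decode("utf-8"))


# Simulate a damaged file: JSON bytes passed through str().
payload = json.dumps({"title": "it's 640\u00a0\u00d7\u00a0480"}).encode("utf-8")
damaged = str(payload)

recovered = recover_json(damaged)
```

Because the whole literal is parsed at once, this should cover any escape that appears in the file, not only the four listed above.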

jimlynnjulian commented 6 years ago

Discovered that the file has to be written in binary mode, which in turn requires 'wb' instead of 'w' as the mode when opening the file. Otherwise, Python writes the string form of the bytes object, with the bytes indicator ('b') prepended.
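A minimal sketch of that fix, assuming the returned data arrives as UTF-8 encoded bytes. The variable body, the file name, and both helper functions are stand-ins for illustration and are not taken from 'client.py'.

```python
import json
import os
import tempfile

# Stand-in for the bytes returned by the server (an assumption).
body = json.dumps({"size": "640\u00a0\u00d7\u00a0480"}).encode("utf-8")

path = os.path.join(tempfile.mkdtemp(), "result.json")


def save_as_bytes(path, body):
    # Binary mode writes the bytes unchanged, so no b'...' wrapper appears.
    with open(path, "wb") as f:
        f.write(body)


def save_as_text(path, body):
    # Alternative: decode first, then write text with an explicit encoding.
    with open(path, "w", encoding="utf-8") as f:
        f.write(body.decode("utf-8"))


save_as_bytes(path, body)
with open(path, encoding="utf-8") as f:
    from_bytes = json.load(f)

save_as_text(path, body)
with open(path, encoding="utf-8") as f:
    from_text = json.load(f)
```

Both routes should leave the same valid JSON on disk. The b'...' form only shows up when the bytes are turned into text with str() instead of decode().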