linz / lds-metadata-updater

For downloading, editing and then setting LDS metadata

encoding error with windows for specific dataset #19

Closed SPlanzer closed 6 years ago

SPlanzer commented 6 years ago

FYI @iharrison

An encoding error occurs on Windows for a specific dataset (51932).

The issue did not arise in development on an Ubuntu machine. The script forces UTF-8.

Ubuntu machine encoding:

import locale
print(locale.getpreferredencoding(False))
>>>UTF-8

Windows 10 VM encoding:

import locale
print(locale.getpreferredencoding(False))
>>>cp1252

Traceback (most recent call last):
  File "metadata_updater.py", line 396, in <module>
    main()
  File "metadata_updater.py", line 362, in main
    update_metadata(file, mapping[i])
  File "metadata_updater.py", line 146, in update_metadata
    for line in file:
  File "C:\Users\user\AppData\Local\Programs\Python\Python35\Lib\fileinput.py", line 248, in __next__
    line = self._readline()
  File "C:\Users\user\AppData\Local\Programs\Python\Python35\Lib\fileinput.py", line 362, in _readline
    return self._readline()
  File "C:\Users\user\AppData\Local\Programs\Python\Python35\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 7147: character maps to <undefined>
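For anyone trying to reproduce the failure without the dataset, here is a minimal sketch of how a byte 0x9d can trip cp1252. It assumes the offending character was a right double quotation mark (U+201D), whose UTF-8 encoding ends in 0x9d; the issue does not say which character was actually involved, and try_decode is a hypothetical helper used only for this demonstration.

```python
# Assumption: the offending character is U+201D (right double quotation mark).
# The issue only reports the byte 0x9d, not the character it belonged to.
utf8_bytes = "\u201d".encode("utf-8")  # b'\xe2\x80\x9d'


def try_decode(data, encoding):
    """Hypothetical helper: decode data, or describe why decoding failed."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "{}: {}".format(type(exc).__name__, exc.reason)


# Decoding with the encoding the file was written in succeeds.
utf8_result = try_decode(utf8_bytes, "utf-8")

# Python's cp1252 table has no mapping for 0x9d, so decoding fails.
cp1252_result = try_decode(utf8_bytes, "cp1252")

print(repr(utf8_result))
print(cp1252_result)
```

This matches the two locale results above: the same bytes decode cleanly where the preferred encoding is UTF-8 and fail where it is cp1252.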

SPlanzer commented 6 years ago

fileinput cannot combine inplace=True with openhook=fileinput.hook_encoded("utf-8").
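A short sketch of that restriction, for reference. The file name example.xml is a placeholder and combine_inplace_and_hook is a hypothetical helper; the file is never opened, because fileinput rejects the argument combination when the object is constructed.

```python
import fileinput


def combine_inplace_and_hook(path):
    """Hypothetical helper: return the error fileinput raises, if any."""
    try:
        fileinput.FileInput(
            files=[path],
            inplace=True,
            openhook=fileinput.hook_encoded("utf-8"),
        )
    except ValueError as exc:
        # Raised at construction time, before any file is opened.
        return str(exc)
    return None


message = combine_inplace_and_hook("example.xml")  # placeholder path
print(message)
```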

Instead, _locale._getdefaultlocale = (lambda *args: ['en_US', 'utf8']) is now used to override the default locale, which forces all text files opened without an explicit encoding to be read as UTF-8.
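An alternative that avoids patching a private module might be to do the in-place rewrite by hand with an explicit encoding. The sketch below is not what metadata_updater.py does; rewrite_in_place and its transform argument are hypothetical names, and it assumes each metadata file is small enough to stream line by line into a temporary file in the same directory.

```python
import os
import tempfile


def rewrite_in_place(path, transform, encoding="utf-8"):
    """Hypothetical helper: rewrite path line by line with a fixed encoding.

    Each line is passed through transform and written to a temporary file,
    which then replaces the original, so the platform's preferred encoding
    is never consulted.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        # newline="" keeps the original line endings untouched.
        with open(fd, "w", encoding=encoding, newline="") as dst, \
                open(path, "r", encoding=encoding, newline="") as src:
            for line in src:
                dst.write(transform(line))
        # Both files are closed here, so the replace also works on Windows.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A call such as rewrite_in_place("metadata.xml", lambda line: line.replace("old", "new")) would then behave the same on Ubuntu and Windows, where metadata.xml is a placeholder name. Python 3.10 and later also accept an encoding argument in fileinput.input, which was not available on the Python 3.5 install shown in the traceback.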