ElectionDataAnalysis / electiondata

Tools for consolidation and analysis of raw election results from the most reliable sources -- the election agencies themselves.

File encoding issue on Windows #739

Closed sfsinger19103 closed 2 years ago

sfsinger19103 commented 2 years ago

From JOSS review

Having worked my way through some of the User Guide examples of results processing/handling with TestingData, I ran into a couple of problems, particularly around Windows use, that I think need fixing. First:

Using Windows, I ran into a UnicodeEncodeError from load_all() on the TestingData repo -- which fully crashed the process:

Munger warnings written to ./output\load_all_1123_1122\Alaska_munger_ak_gen_fwab.munger.warnings
Traceback (most recent call last):
  File "[...]\electiontest\testing.py", line 3, in <module>
    dl.load_all()
  File "[...]\electiondata.venv\lib\site-packages\electiondata-2.0-py3.9.egg\electiondata__init__.py", line 874, in load_all
    err = ui.report(
    f.write(out_str)
  File "C:\Python\39\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\x96' in position 969: character maps to <undefined>

I'm assuming it was just trying to write another warning, as I could re-run to resume the processing without further error. My assumption is that one of these input files was UTF-8 encoded, and could not be interpreted as Windows' default CP-1252.
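For anyone trying to reproduce this without the full data set, here is a minimal sketch of what I think is happening. It is not taken from the electiondata code. It only shows that the character named in the traceback cannot be encoded with Python's cp1252 codec, while UTF-8 handles it.

```python
# U+0096 is a C1 control character; Python's cp1252 codec has no byte for it.
ch = "\x96"

try:
    ch.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError as exc:
    # Same error class and codec name ('charmap') as in the traceback above.
    cp1252_ok = False
    print(exc)

# UTF-8 can encode every Unicode code point, including this one.
utf8_bytes = ch.encode("utf-8")
print(cp1252_ok, utf8_bytes)
```

On Windows, a file opened in text mode without an explicit encoding typically uses the locale's code page (cp1252 here), which would explain why the write fails only on that platform.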

You should ensure that the program is Unicode-aware. For file handling see, e.g., https://stackoverflow.com/questions/19591458/python-reading-from-a-file-and-saving-to-utf-8 and for the full story, https://docs.python.org/3/howto/unicode.html
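A minimal sketch of the kind of change I have in mind, assuming the warning files are written with the built-in `open()`. The helper names `write_text_utf8` and `read_text_utf8` and the file name are hypothetical and are not part of electiondata.

```python
import tempfile
from pathlib import Path


def write_text_utf8(path, text):
    """Write text with an explicit encoding instead of the platform default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def read_text_utf8(path):
    """Read the file back with the same explicit encoding."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# A message containing the character from the traceback plus an en dash.
message = "Contest name contains \x96 and \u2013"

with tempfile.TemporaryDirectory() as tmp:
    warnings_path = Path(tmp) / "example.munger.warnings"  # hypothetical file name
    write_text_utf8(warnings_path, message)
    round_trip = read_text_utf8(warnings_path)

print(round_trip == message)
```

Passing `encoding="utf-8"` on every `open()` call makes the output independent of the Windows code page, so the same code behaves the same way on all platforms.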

sfsinger19103 commented 2 years ago

Fixed.