clirdlf / participatory_voting

http://voting.diglib.org

Special Characters on Import #3

Closed waynegraham closed 8 years ago

waynegraham commented 8 years ago

The import:conftool task was throwing "Invalid byte code" exceptions. This appears to be caused by curly apostrophes in the imported data. Right now the system forces ISO-8859-1 encoding, but it would probably be better to replace these characters with a plain ASCII apostrophe. We need an automated method for handling this.
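One way the automated replacement could look is sketched below. This is a minimal sketch, not the code used in the import task: `SMART_PUNCTUATION`, `SMART_PUNCTUATION_PATTERN`, and `normalize_punctuation` are hypothetical names, and the sketch assumes the text has already been decoded to valid UTF-8 (`gsub` raises on invalid byte sequences).

```ruby
# Hypothetical helper, not part of the import:conftool task.
# Maps common "smart" punctuation to ASCII equivalents.
SMART_PUNCTUATION = {
  "\u2018" => "'",  # left single quotation mark
  "\u2019" => "'",  # right single quotation mark (curly apostrophe)
  "\u201C" => '"',  # left double quotation mark
  "\u201D" => '"'   # right double quotation mark
}.freeze

# One pattern that matches any of the characters listed above.
SMART_PUNCTUATION_PATTERN = Regexp.union(SMART_PUNCTUATION.keys)

# Returns a copy of text with smart punctuation replaced.
# Assumes text is valid UTF-8.
def normalize_punctuation(text)
  text.gsub(SMART_PUNCTUATION_PATTERN, SMART_PUNCTUATION)
end
```

Calling `normalize_punctuation` on each CSV field before saving would turn a value such as `Wayne’s` into `Wayne's`, leaving other non-ASCII characters untouched.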

waynegraham commented 8 years ago

Turns out that this is an issue with Excel in Office 2011. The Excel spreadsheet is in UTF-8; however, Excel on the Mac is incapable of reading anything other than 8-bit CSV. I couldn't even get it to parse correctly after converting with iconv:

iconv -f $(file -b --mime-encoding lib/assets/file.csv) -t utf-8 < lib/assets/file.csv > lib/assets/processed.csv

The workaround is to download the report, open it with LibreOffice, and use "Save As" to generate the CSV. I'll update the README.
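If a scripted path is ever wanted instead of the manual LibreOffice step, something like the following could be tried on the Ruby side. It is only a sketch under stated assumptions: `to_utf8` is a hypothetical helper, and the fallback encoding (Windows-1252 here) is a guess, since nothing above establishes which 8-bit encoding Excel actually wrote.

```ruby
# Hypothetical helper: returns UTF-8 text from raw CSV bytes.
# Assumption: bytes that are not valid UTF-8 are in the fallback
# encoding. Windows-1252 is a guess, not something confirmed here.
def to_utf8(raw, fallback: Encoding::Windows_1252)
  # Keep the bytes as they are if they already form valid UTF-8.
  utf8 = raw.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  # Otherwise reinterpret the bytes in the fallback encoding and
  # transcode, substituting a replacement character for anything
  # that cannot be converted.
  raw.dup.force_encoding(fallback)
     .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end
```

The file would need to be read in binary mode first, for example with `File.binread`, so that Ruby does not apply an encoding before `to_utf8` sees the bytes. If the export turns out to use a different 8-bit encoding, pass it as `fallback`.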