DigitalCommons / open-data-and-maps

Deprecated: Implementation of Linked Open Data by the Solidarity Economy Association

Sort out character encoding issues from CSV #107

Closed matt-wallis closed 5 years ago

matt-wallis commented 5 years ago

There are examples of character encoding problems in the newcastle-mapjam data, which was imported via CSV from a Google spreadsheet, as raised by @Clara-dos-Santos in https://github.com/SolidarityEconomyAssociation/open-data-and-maps-outreach/issues/123#issuecomment-435022947

We need to establish which character encoding the Google spreadsheet uses when it exports CSV (sigh, one of the 'joys' of CSV files is that they don't declare their character encoding 👎).
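To illustrate why the undeclared encoding matters, here is a minimal Ruby sketch. It is not taken from the repository: `decode_as` is a hypothetical helper and the sample word is made up. It shows the same bytes turning into different text depending on which encoding the reader assumes.

```ruby
# Hypothetical helper (not from the repo): interpret raw bytes under an
# assumed encoding, then convert the result to UTF-8.
def decode_as(raw_bytes, assumed_encoding)
  raw_bytes.dup.force_encoding(assumed_encoding).encode("UTF-8")
end

# The UTF-8 bytes of "Café", with no encoding label attached (as in a CSV file).
raw = "Caf\u00E9".b

puts decode_as(raw, "UTF-8")         # Café
puts decode_as(raw, "Windows-1252")  # CafÃ©
```

The second line of output is the kind of garbling that typically shows up when UTF-8 data is read as a Windows or Latin-1 encoding.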

matt-wallis commented 5 years ago

Note the use of iconv in to-utf8.rb. Before it can be used, we need to find out the encoding of the CSV (by trial and error, loading the CSV into LibreOffice). Here's a list of encodings supported by iconv. The list is 6 years old, and may have grown since then.
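The trial-and-error step could also be roughed out in Ruby itself. The sketch below is an assumption-laden illustration, not what to-utf8.rb does: `to_utf8_guess` and `CANDIDATE_ENCODINGS` are hypothetical names, the candidate list is a guess, and it uses Ruby's built-in `String#encode` rather than iconv.

```ruby
# Hypothetical candidate list: an assumption, not taken from the repo.
CANDIDATE_ENCODINGS = ["UTF-8", "Windows-1252", "ISO-8859-1"].freeze

# Try each candidate in order. Return [encoding_name, utf8_text] for the first
# candidate under which the bytes are valid and convertible, or nil if none fit.
def to_utf8_guess(raw_bytes, candidates = CANDIDATE_ENCODINGS)
  candidates.each do |name|
    text = raw_bytes.dup.force_encoding(name)
    next unless text.valid_encoding?
    begin
      return [name, text.encode("UTF-8")]
    rescue EncodingError
      # Bytes have no mapping in this encoding; try the next candidate.
      next
    end
  end
  nil
end

# "Caf\xE9" is "Café" as Windows-1252 bytes; it is not valid UTF-8.
name, text = to_utf8_guess("Caf\xE9".b)
puts "#{name}: #{text}"
```

A successful decode is not proof that the guess is right: single-byte encodings accept almost any byte sequence, so the order of candidates matters and the result still needs checking by eye, much like the LibreOffice check.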

ColmMassey commented 5 years ago

I think this also covers the UTF issue present in the DotCoop data.

ColmMassey commented 5 years ago

See also https://github.com/SolidarityEconomyAssociation/open-data-and-maps-outreach/issues/171

ColmMassey commented 5 years ago

This seems to have been addressed, certainly in the Oxford data.