scraperwiki / spreadsheet-upload-tool

A ScraperWiki tool for uploading structured data from a CSV or Excel spreadsheet

utf-8 in CSV files breaks upload. #3

Open drj11 opened 11 years ago

drj11 commented 11 years ago

Somewhere in dumptruck we get an error if a CSV file has non-ASCII data:

Traceback (most recent call last):
  File "tool/code/extract.py", line 102, in <module>
    main()
  File "tool/code/extract.py", line 98, in main
    save(extract(filename))
  File "tool/code/extract.py", line 47, in save
    scraperwiki.sql.save([], rows, table_name=sheetName)
  File "/usr/local/lib/python2.7/dist-packages/scraperwiki/sqlite.py", line 34, in save
    return dt.upsert(data, table_name = table_name)
  File "/usr/local/lib/python2.7/dist-packages/dumptruck/dumptruck.py", line 301, in upsert
    self.insert(upsert=True, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dumptruck/dumptruck.py", line 284, in insert
    self.execute(sql, values, commit=False)
  File "/usr/local/lib/python2.7/dist-packages/dumptruck/dumptruck.py", line 136, in execute
    self.cursor.execute(sql, *args)
ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

A reasonable fix for this would be to guess the encoding of the CSV file (perhaps by using the file command, or whatever file does internally), and then decode each line to Unicode before the values reach the database, so that dumptruck never sees 8-bit bytestrings.

There's an example of Unicode handling in the csv module docs: http://docs.python.org/2/library/csv.html#csv-examples
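
A rough sketch of the idea, written for Python 3 (the traceback above is from Python 2.7, where the recipe in the linked csv docs would be needed instead). The names guess_encoding, read_csv_rows and CANDIDATE_ENCODINGS are made up for illustration and are not part of extract.py, and the candidate list is an assumption rather than a real detector like file. It also decodes the whole file at once rather than line by line, which is simpler when the csv module accepts text directly.

```python
import csv
import io

# Hypothetical candidate list, tried in order. latin-1 accepts any byte
# sequence, so it acts as a last-resort fallback.
CANDIDATE_ENCODINGS = ("utf-8-sig", "cp1252", "latin-1")


def guess_encoding(raw, candidates=CANDIDATE_ENCODINGS):
    """Return the first candidate encoding that decodes raw without error."""
    for encoding in candidates:
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    raise ValueError("none of the candidate encodings fit the data")


def read_csv_rows(raw):
    """Decode raw CSV bytes to text, then parse them into lists of str cells."""
    text = raw.decode(guess_encoding(raw))
    # newline="" leaves line endings untouched so the csv module handles them.
    return list(csv.reader(io.StringIO(text, newline="")))


# Every cell is now a Unicode string, never an 8-bit bytestring.
rows = read_csv_rows("name,city\nZoë,Zürich\n".encode("utf-8"))
```

Trying utf-8 first is deliberate: non-UTF-8 data that contains non-ASCII bytes rarely decodes as valid UTF-8, whereas cp1252 and latin-1 will accept almost anything and so only make sense as fallbacks.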