wireservice / agate-sql

agate-sql adds SQL read/write support to agate.
https://agate-sql.readthedocs.io
MIT License

Handle non-ascii text values in Python 2 #16

Closed JoeGermuska closed 4 years ago

JoeGermuska commented 8 years ago

Over on the News Nerd Slack, @chrislkeller reported problems with a Unicode database.

Traceback (most recent call last):
  File "_init.py", line 16, in <module>
    new_table = agate.Table.from_sql('mysql:...', '...')
  File "/usr/local/lib/python2.7/site-packages/agatesql/table.py", line 87, in from_sql
    return agate.Table(rows, column_names, column_types)
  File "/usr/local/lib/python2.7/site-packages/agate/table/__init__.py", line 166, in __init__
    new_rows.append(Row(tuple(cast_funcs[i](d) for i, d in enumerate(row)), self._column_names))
  File "/usr/local/lib/python2.7/site-packages/agate/table/__init__.py", line 166, in <genexpr>
    new_rows.append(Row(tuple(cast_funcs[i](d) for i, d in enumerate(row)), self._column_names))
  File "/usr/local/lib/python2.7/site-packages/agate/data_types/text.py", line 36, in cast
    return six.text_type(d)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 51: ordinal not in range(128)
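
For anyone reading the traceback above: the last frame calls six.text_type(d), which on Python 2 is unicode(d), and that decodes a byte string with the ascii codec. Below is a rough Python 3 sketch of the same failure and of decoding with an explicit encoding instead. It assumes the driver handed back UTF-8 bytes (not confirmed in the report), and to_text is a hypothetical helper, not part of agate or agate-sql.

```python
# Sketch only. Assumption: the database driver returned UTF-8 encoded bytes.
raw = b"\xc2\xa9 2016"  # UTF-8 bytes for a copyright sign followed by " 2016"


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: decode byte strings, pass other values through."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Decoding with the ascii codec fails on byte 0xc2, as in the traceback.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decoding with the encoding the data was actually written in succeeds.
text = to_text(raw)
```

If the data were in some other encoding (latin-1, say), the encoding argument would have to match it, which is why an explicit option is attractive.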

Further discussion there suggested adding an encoding kwarg to from_sql. A little sniffing suggests that you can also manipulate the connection string to force results to UTF-8, although I don't know what you'd do if a Connection were passed in, or if the string argument already had URL parameters.
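
A minimal sketch of the connection-string approach, including the case where the URL already carries parameters. It uses only the standard library. with_charset is a hypothetical helper, and the charset query parameter is an assumption that holds for the common MySQL drivers used through SQLAlchemy; other backends may spell it differently or ignore it. It does nothing for the case where a Connection object is passed in.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_charset(url, charset="utf8mb4"):
    """Hypothetical helper: add a charset parameter to a database URL.

    Existing query parameters are preserved, and an existing charset
    value is left untouched rather than overwritten.
    """
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.setdefault("charset", charset)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URLs for illustration; not taken from the original report.
plain = with_charset("mysql://user:pw@localhost/db")
merged = with_charset("mysql://user:pw@localhost/db?ssl=true")
kept = with_charset("mysql://user:pw@localhost/db?charset=latin1")
```

The resulting string can then be passed to from_sql in place of the original connection string.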

jpmckinney commented 5 years ago

I don't think this is a bug. You need to provide agate-sql with UTF-8. That's how csvkit does it; it changes the encoding before passing to agate-sql.

jpmckinney commented 5 years ago

Ah, never mind: this is to do with from_sql, not to_sql.

jpmckinney commented 4 years ago

Python 2 is now EOL.