Wrong datatype for long query using cursor.copy_from

hneiva commented 7 years ago

Getting the following error:

Traceback (most recent call last):
  File "search.py", line 128, in <module>
    curr.copy_from(file=output, table="inv.vc_event", sep=COLUMN_SEPARATOR, null=NULL, columns=COLUMNS)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/psycopg2cffi/_impl/cursor.py", line 30, in check_closed_
    return func(self, *args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/psycopg2cffi/_impl/cursor.py", line 53, in check_async_
    return func(self, *args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/psycopg2cffi/_impl/cursor.py", line 445, in copy_from
    self._pq_execute(query)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/psycopg2cffi/_impl/cursor.py", line 696, in _pq_execute
    self._pq_fetch()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/psycopg2cffi/_impl/cursor.py", line 747, in _pq_fetch
    return self._pq_fetch_copy_in()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/psycopg2cffi/_impl/cursor.py", line 830, in _pq_fetch_copy_in
    res = libpq.PQputCopyData(pgconn, data, len(data))
TypeError: initializer for ctype 'char *' must be a str or list or tuple, not unicode

Notes:

- output is a StringIO in this case.
- The exception does not occur if output is small (a few records).
- output.pos always seems to be at 8192 (which hints at buffering).
- No problem when running the same code on psycopg2/python.

thedrow commented 7 years ago

cffi requires unicode to be wchar * as far as I recall.

hneiva commented 7 years ago

For reference:

I'm not experienced enough with Python or C to take a closer look at it. Does anyone have a way around this?

wiml commented 6 years ago

It seems that the line is not being encoded into the connection encoding before being given to PQputCopyData. (Python unicode objects don't have a single obvious C representation; you need to encode them into a bytes object in some specific encoding before giving them to C code. The note in the psycopg2 docs suggests that the correct encoding is the postgres connection encoding, which seems reasonable.)
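
A possible workaround along those lines, as a minimal sketch: wrap the text buffer in a small file-like object that encodes each chunk to bytes before it is handed on. EncodingReader is a hypothetical helper, not part of psycopg2cffi, and UTF-8 is only an assumption here; the encoding should match whatever the connection actually uses. I have not verified this against psycopg2cffi's copy_from itself.

```python
import io


class EncodingReader(object):
    """Hypothetical wrapper: reads text from a stream and returns encoded bytes."""

    def __init__(self, text_stream, encoding="utf-8"):
        self._stream = text_stream
        self._encoding = encoding  # assumed to match the connection encoding

    def _to_bytes(self, chunk):
        # Pass bytes through untouched; encode unicode text.
        if isinstance(chunk, bytes):
            return chunk
        return chunk.encode(self._encoding)

    def read(self, size=-1):
        # size counts characters of the underlying text stream, so the
        # returned bytes object may be longer than size for non-ASCII data.
        return self._to_bytes(self._stream.read(size))

    def readline(self, size=-1):
        return self._to_bytes(self._stream.readline(size))


# Stand-in for the StringIO buffer from the report (tab-separated, \N as null).
output = io.StringIO(u"1\tcaf\u00e9\n2\t\\N\n")
wrapped = EncodingReader(output, encoding="utf-8")

first_chunk = wrapped.read(8192)   # bytes, not unicode
second_chunk = wrapped.read(8192)  # empty bytes at end of stream
```

The wrapped object would then be passed as the file argument in place of output, for example curr.copy_from(file=EncodingReader(output), table="inv.vc_event", ...). Whether copy_from accepts chunks that are larger in bytes than the requested size is something to check before relying on this.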

chtd / psycopg2cffi

Wrong datatype for long query using cursor.copy_from #86