farcepest / MySQLdb1

MySQL database connector for Python (legacy version)
https://sourceforge.net/projects/mysql-python/
666 stars 318 forks source link

Writing binary data with the driver triggers a Warning #125

Open thanatos opened 7 years ago

thanatos commented 7 years ago
>>> a_cursor.execute('INSERT INTO binarystuff VALUES (%s)', (MySQLdb.Binary('\xdd'),))
./env/bin/ipython:1: Warning: Invalid utf8 character string: 'DD'

(where binarystuff is a table,

CREATE TABLE `binarystuff` (
  `col` varbinary(10) NOT NULL,
  PRIMARY KEY (`col`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

This is because the SQL query that actually ends up being generated by the driver isn't valid. It appears to simply contain a 0xdd byte right in the middle of the query string. (Which isn't valid UTF-8, and hence the warning. Somehow MySQL does the right thing in the end, but warns you for it.)

The value needs to be escaped into a binary string literal:

INSERT INTO binarystuff VALUES (x'dd');

The driver conflates text and binary here:

def Binary(x):
    return str(x)

Because of this, it can't tell the difference between a str holding text and a str holding binary data, in Python 2.

Really, Binary (in Python 2) needs to be a real wrapper, s.t. when the time to escape the values comes, they can be properly escaped. (It seems like either hex escaping can be used, or perhaps prefixing the literal with _binary might work.)

In Python 3, the difference between bytes and str is more rigorous, and the type alone contains sufficient information to properly escape.

~Edit: there is a fork; they appear to have fixed this by making it bytearray on Python 2.~ nope they have the bug too

farcepest commented 7 years ago

try making your query string unicode