mtadayon / pypyodbc

Automatically exported from code.google.com/p/pypyodbc

UnicodeDecodeError because UTF-16 surrogate pairs are not handled #58

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Call fetchall() on a query that returns string values containing a character 
with a 4-byte UTF-16 representation (a surrogate pair), for instance the emoji 
smiley u'\U0001f604'.

What is the expected output? What do you see instead?
The string values should be decoded correctly. Instead, this exception is raised:

UnicodeDecodeError: 'utf16' codec can't decode bytes in position 0-1: 
unexpected end of data

I think the problem is that UCS_dec() assumes that a character is always 
exactly one ucs_length long, which does not hold for surrogate pairs.
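
The suspected failure mode can be reproduced without a database. This is a minimal sketch, assuming a ucs_length of 2 bytes and little-endian UTF-16 (neither is stated in the report); it does not call pypyodbc at all:

```python
# The emoji from the report lies outside the Basic Multilingual Plane,
# so UTF-16 encodes it as two 2-byte code units (a surrogate pair).
smiley = u'\U0001f604'
encoded = smiley.encode('utf-16-le')  # little-endian, no BOM
code_units = len(encoded) // 2

# Decoding a single code unit in isolation hits a lone high surrogate.
error = None
try:
    encoded[:2].decode('utf-16-le')
except UnicodeDecodeError as exc:
    error = exc

print(code_units)
print(error)

# Decoding both code units together works.
assert encoded.decode('utf-16-le') == smiley
```

Any decoder that slices the buffer into fixed-size pieces and decodes each piece separately would fail the same way on such characters.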

Original issue reported on code.google.com by stefan.m...@gmail.com on 12 Dec 2014 at 3:30

GoogleCodeExporter commented 9 years ago
Here's a quick-and-dirty fix 
https://github.com/smatting/pypyodbc/commit/49db4a7ca9b5794af6dc35d4989f4f27c1ca2f0b
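
Independently of the linked commit (whose contents are not reproduced here), one possible approach is to locate the terminating NUL first and then decode the whole string in a single call, so surrogate pairs are never split. This is a rough sketch: decode_utf16_buffer is a hypothetical helper, not part of pypyodbc, and it assumes 2-byte little-endian code units:

```python
def decode_utf16_buffer(raw):
    """Decode a NUL-terminated UTF-16-LE byte buffer (hypothetical helper)."""
    end = 0
    # Walk the buffer one 2-byte code unit at a time looking for the
    # terminator. Neither half of a surrogate pair is ever 0x0000, so this
    # scan cannot stop in the middle of a character.
    while end + 2 <= len(raw) and raw[end:end + 2] != b'\x00\x00':
        end += 2
    # Decode everything before the terminator at once, letting the codec
    # combine surrogate pairs itself.
    return raw[:end].decode('utf-16-le')


# Simulated driver buffer: text, terminator, then leftover bytes.
buf = u'a\U0001f604b'.encode('utf-16-le') + b'\x00\x00' + b'junk'
print(repr(decode_utf16_buffer(buf)))
```

Decoding once per string rather than once per code unit also avoids building a list of single-character strings.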

Original comment by stefan.m...@gmail.com on 12 Dec 2014 at 5:33