methew / pyodbc

Automatically exported from code.google.com/p/pyodbc
MIT No Attribution

Loosening up encodings #330

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
In Python 2, the user was responsible for encoding and had to make sure that 
what he sent over the ODBC connection was correct.

In Python 3, 99% of cases are much clearer. However, it is not possible to 
query a database when a field has mixed encodings across records or contains 
other 'garbage', such as 'binary' encrypted data that matches no encoding, or 
even zipped data or other trash that shouldn't be there. The reality is that 
sometimes there is.
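
To illustrate the problem, here is a minimal sketch in plain Python, with no 
pyodbc involved. The `records` list and the `try_decode` helper are made up 
for the example; they stand in for the raw bytes a driver might hand back for 
one text column.

```python
import zlib

# Hypothetical contents of one text column across three records.
records = [
    "café".encode("utf-8"),      # well-formed UTF-8
    "café".encode("latin-1"),    # same text in a legacy single-byte encoding
    zlib.compress(b"payload"),   # compressed data stored in a text field
]

def try_decode(raw, encoding="utf-8"):
    """Return the decoded text, or None if raw is not valid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None

# Only the first record survives a single, fixed decoding step.
results = [try_decode(raw) for raw in records]
```

A forced decode with one codec rejects two of the three records, which is why 
a query over such a column cannot complete when decoding is mandatory.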

In Windows ODBC this can be accomplished by unchecking 'perform character 
translation' in the ODBC settings and using Python 2.7, so such things were 
possible, albeit ugly.

Knowing this is all bad practice, why not make unicode_results=True the 
default and, when it is set to False, just return a bytes() object as before 
(PY_MAJOR_VERSION<3)? Then the Python programmer can decode() all he wants, but 
only when he explicitly chose to get himself into such a mess.
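
If raw bytes were returned as proposed, the caller could decode each value on 
his own terms. A rough sketch of what that might look like follows. It assumes 
the values arrive as bytes, which is the proposed behavior rather than 
something pyodbc is confirmed to do here, and `decode_lenient` and its 
candidate encodings are hypothetical.

```python
def decode_lenient(raw, encodings=("utf-8", "cp1252")):
    """Try each candidate encoding in order; hand back the raw bytes if none fit.

    The helper name and the candidate list are assumptions for illustration.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Not text in any known encoding: leave it for the caller to handle.
    return raw

# Values as they might come back from a column with mixed content.
values = ["café".encode("utf-8"), b"caf\xe9", b"\x81\xff"]
decoded = [decode_lenient(value) for value in values]
```

The order of the candidates matters: a codec that accepts every byte sequence, 
such as latin-1, would have to come last, because it never raises and would 
hide everything after it.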

Checking the source, only getdata.cpp and params.cpp would need a minor (as in: 
a two-line change) adjustment to enable this.

The modification would not lead to new untested behavior, just behavior more 
compatible with Python27, so I think it's relatively safe.

That way the Python programmer can 'have it his way' and be more compatible 
with any existing Python 2.7 project in case the standard unicode way would 
really not do for his situation.

I'd be happy to commit such a modification on Git if it were clear that it was 
going to be accepted in the main branch; otherwise I'll just keep it in my 
private version.

For the rest: my compliments for pyodbc being an awesome extension.

Original issue reported on code.google.com by rbrt8...@gmail.com on 10 Jul 2013 at 11:08