tanrj / pyodbc

Automatically exported from code.google.com/p/pyodbc
MIT No Attribution
0 stars 0 forks source link

fetched 'remarks' data contains what looks like a buffer-overrun error #56

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
-----------------------------------------------
1. Using Access 2007 (or possibly other databases), create a new database 
(use the default *.accdb format), and save it as C:\MyDB.accdb

2. Create a new, empty table called "Table1" in that database.

3. Create a new field named "MyField" in that table.  In the 'Description' 
column of that field, put "THIS IS THE DESCRIPTION OF MY FIELD"

4. Launch python, and try the following code:
>>> import pyodbc
>>> con=pyodbc.connect('DRIVER={Microsoft Access Driver (*.mdb, 
*.accdb)};DBQ=C:\\MyDB.accdb')
>>> cursor = con.cursor()
>>> result = cursor.columns( table='Table1' ).fetchall()
>>> for row in result: print row
...or try this...
>>> for row in result: print row.remarks

What is the expected output? What do you see instead?
-----------------------------------------------
I expect to see the unicode string that I typed into Access, terminated 
correctly.  Instead, I see the following:

u'THIS IS THE DESCRIPTION OF MY FIELD\x00\x00\x00\x00\x00\x00\x00\x00\x00
\x00\x00\x00\x00\x00\uffff\uffff\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
\x00\x00\x00\x00\x00\u0698\xb3\u1930\xb3'

Sometimes this string contains illegal number sequences that cause errors 
such as...
UnicodeEncodeError: 'charmap' codec can't encode characters in position 49-
50: character maps to <undefined>
Sometimes this string contains (near the end) other strings within it, of 
other parts of the database, such as other field names or values.

I also tried connecting to the database using a simple DSN, and the same 
error occurred.

What version of the product are you using? On what operating system?
-----------------------------------------------
(Windows XP, Python 2.5, pyODBC v2.1.6, Microsoft Access 2007)

Please provide any additional information below.
-----------------------------------------------
Please contact rmilne@hellbentgames.com for additional information or help 
in any way, in solving this problem.  If you can point me in the right 
direction, I would most certainly be able to help resolve the problem for 
you or with you (but I've never contributed to a project like this before).

NOTE: I attached a zip of the MyDB.accdb file in hopes that you can access 
it even without Microsoft Access installed.

Original issue reported on code.google.com by rogermilne@gmail.com on 11 Jun 2009 at 6:32

Attachments:

GoogleCodeExporter commented 8 years ago
Good catch.  I've reproduced it, but I'm not sure what to do about it so far.

The Unicode Access unit tests work great.  ODBC returns the correct length; for 
a 
254 character field, the length is 508 bytes.

When retrieving the remarks field, the value "this is a description" is 
returned as 
84 bytes, which would mean 42 Unicode characters which is exactly twice as wide 
as 
the actual value.

It appears to be this Access driver bug: http://support.microsoft.com/kb/q236871

Unfortunately it says both "BUFFER_LENGTH and CHAR_OCTET_LENGTH are 30 when 
they 
should also be 15" and "This behavior is by design".  I'm not sure how 
something can 
both be wrong and by design, but it appears it is.

It looks like it always has a terminating NULL, so that's a potential way to 
fix 
it.  I'm not sure I'm ready to always try to trim strings since it would (1) 
affect 
performance and (2) break code that expects embedded NULLs.  Perhaps I should 
set a 
flag when Cursor.columns is used?

In the meantime, you can use something like the following to work around it:

  cleaned = row and row.remarks.split(chr(0), 1)[0] or None

Original comment by mkleehammer on 8 Aug 2009 at 3:46

GoogleCodeExporter commented 8 years ago
Wow, that's interesting...  Leave it to Microsoft to make a "By Design" bug...  

ugh.  

Thanks for the work-around idea, it's much better than mine (I, of course, 
thought 
about the problem way too close to the issue, and made a for-loop that pulled 
characters one at a time,rebuilding a new string from it, up to the NULL).  DOH.

I also noticed a second issue on the same field:  I can enter text as long as I 
want 
into that field in Access, but if I exceed 128 characters, the string is 
returned 
as "None" or an empty string (can't remember which), in Python.  I don't think 
that 
one's nearly as work-around'able at this point :-(  I've had to just keep my 
text 
short and keep people aware of the issue when they add text as well.

Thanks for investigating the bug, I really appreciate it!!

    Roger

Original comment by rogermilne@gmail.com on 13 Aug 2009 at 5:01