unicode object -> pyodbc -> unixodbc -> freetds fails

GoogleCodeExporter commented 8 years ago

What steps will reproduce the problem?
1. Install ubuntu
2. Install the packages:
unixodbc and unixodbc-dev: 2.2.11-16build3
tdsodbc: 0.82-4
libsybdb5: 0.82-4
freetds-common and freetds-dev: 0.82-4
3. Configure freetds.conf and unixodbc.ini as below
4. Install pyodbc latest version
5. Run the code below

What is the expected output? What do you see instead?
I expect to be able to insert unicode objects, but I see an error instead.
If I insert utf-8 data, I expect to get utf-8 data back, but instead I get
unicode objects, that when decoded don't match the utf-8 data I inserted in
first place.

What version of the product are you using? On what operating system?
Latest version on ubuntu 9.04

Please provide any additional information below.

I have configured /etc/unixodbc.ini like this:

[FreeTDS]
Description             = TDS driver (Sybase/MS SQL)
Driver          = /usr/lib/odbc/libtdsodbc.so
Setup           = /usr/lib/odbc/libtdsS.so
CPTimeout               = 
CPReuse         = 
UsageCount              = 2

I have configured /etc/freetds/freetds.conf like this:

[global]
    tds version = 8.0
    client charset = UTF-8

I have grabbed pyodbc revision 31e2fae4adbf1b2af1726e5668a3414cf46b454f
from http://github.com/mkleehammer/pyodbc and installed it using "python
setup.py install"

I have a windows machine with Microsoft SQL Server 2000 installed on my
local network, up and listening on the local ip address 10.32.42.69. I have
an empty database created with name "Common". I have the user "sa" with
password "secret" with full priviledges.

I am using the following python code to setup the connection:

import pyodbc
odbcstring =
"SERVER=10.32.42.69;UID=sa;PWD=secret;DATABASE=Geral;DRIVER=FreeTDS"
con = pyodbc.connect(s)
cur = con.cursor()
cur.execute('''
CREATE TABLE testing (
    id INTEGER NOT NULL IDENTITY(1,1), 
    name NVARCHAR(200) NULL, 
    PRIMARY KEY (id)
)
    ''')
con.commit()

Everything WORKS up to this point. I have used SQLServer's Enterprise
Manager on the server and the new table is there. Now I want to insert some
data on the table.

cur = con.cursor()
cur.execute('INSERT INTO testing (name) VALUES (?)', (u'áéí',))

That fails!! Seems like pyodbc won't accept a unicode object. Here's the
error I get:

pyodbc.Error: ('HY004', '[HY004] [FreeTDS][SQL Server]Invalid data type 
(0) (SQLBindParameter)'

Since my freetds client is configured to use UTF-8 as above, I thought I
could solve by encoding data to UTF-8. That gives no error, but then I get
back strange data when I query. pyodbc returns unicode strings, and decoded
with the wrong encoding so the chars are wrong.

If I can't insert an unicode string, why do I get those back? And wrong?

Original issue reported on code.google.com by nosklo@gmail.com on 3 Jun 2009 at 9:29

GoogleCodeExporter commented 8 years ago

You get Unicode back because that's the format the data is stored in in the 
database 
(NVARCHAR).  When you insert, ODBC, drivers, SQL Server, etc. perform the 
necessary 
conversion to convert to Unicode.  When you read, it is then unrelated to the 
insert, so to speak.

Can you generate an ODBC trace?  (I know -- that's always my first question...)

pyodbc certainly allows you to insert Unicode strings, so I'm thinking it's 
FreeTDS 
related.  I only spent a little time researching, but I haven't found anything 
recent and conclusive.  This looks interesting: 
http://www.freetds.org/userguide/unicodefreetds.htm

I'll keep looking, but the trace might help.

Original comment by mkleehammer on 8 Aug 2009 at 4:02

Changed state: Investigating

GoogleCodeExporter commented 8 years ago

I am facing this same issue while running our unit tests (that work fine on 
windows) 
on linux. The test is simply creating a temp table with an nvarchar column and 
trying 
to insert a unicode string with value u'unicode'. I would like to generate the 
trace 
and send it to you, but not sure how to do this. However, I noticed a trace 
file 
already being generated (not sure if pyodbc generates it or freetds), but I am 
attaching the file.

I am also attaching the trace extracted from sql server profiler. Here is the 
exported SQL from profiler:

EXEC sp_datatype_info 93,3
go
create table #test_returned_types ( nvarchar_col nvarchar(128) )
go
declare @p1 int
set @p1=NULL
exec sp_prepare @p1 output,N'@P1 varchar(80)',N'--- ARGS: (u''unicode'',)
insert into #test_returned_types (nvarchar_col) values (@P1)',1
select @p1
go
exec sp_unprepare 1
go
drop table #test_returned_types
go

I would really appreciate if a consensus can be obtained on where the problem 
is or 
how to fix/workaround it. I am willing to provide any further information you 
need.

Original comment by harid...@gmail.com on 12 Feb 2010 at 2:27

Attachments:

GoogleCodeExporter commented 8 years ago

Here is an update from my side. I tried the FreeTDS driver via jdbc-odbc bridge 
and 
ran the same insert with some chinese characters and it ran just fine. Both 
jdbc-odbc 
bridge and pyodbc go through the same odbc driver and the same freetds driver. 
Both 
python and java bind the same chinese characters, so doesn't this isolate the 
problem 
to pyodbc? I used the profiler that comes with SQL Server and captured the 
events to 
see if I can make out any difference. I didn't find anything interesting, but 
you 
might be able to make some difference, so I am attaching both the files. You 
may diff 
them side by side using vimdiff or some other visual diff tool to get a better 
idea. 
I really need to get this working, so I would appreciate any help in getting 
this to 
work.

Original comment by harid...@gmail.com on 13 Feb 2010 at 3:00

Attachments:

GoogleCodeExporter commented 8 years ago

I have made a couple of mistakes earlier, so rebuilt the profiler trace files 
again 
and attaching them. The diff shows this prominently:

pyodbc:
      <Column id="1" name="TextData">declare @p1 int
set @p1=NULL
exec sp_prepare @p1 output,N'@P1 varchar(80)',N'--- ARGS: (u''????!'',)
insert into #test_returned_types (nvarchar_col) values (@P1)',1
select @p1</Column>

jdbc-odbc:
      <Column id="1" name="TextData">declare @p1 int
set @p1=NULL
exec sp_prepare @p1 output,N'@P0 nvarchar(4000)',N'insert into 
#test_returned_types 
(nvarchar_col) values ( @P0 )',1
select @p1</Column>

You can see that the output type is different, varchar vs nvarchar. Is pyodbc 
incorrectly using varchar here causing the bind to fail?

Original comment by harid...@gmail.com on 17 Feb 2010 at 3:30

Attachments:

GoogleCodeExporter commented 8 years ago

This looks like the sort of unicode parameter handling bug which the FreeTDS
developers have recently fixed, and not strictly speaking a pyodbc problem.

Try removing the Ubuntu freetds-common, freetds-bin, and tdsodbc packages, 
which are
based on freetds-0.82, and installing the latest FreeTDS from source. The
installation steps look something like:

wget http://ftp.ibiblio.org/pub/Linux/ALPHA/freetds/current/freetds-current.tgz
tar xzf freetds-current.tgz
cd freetds-* (was freetds-0.83.dev.20100122 as I write this)
./configure --sysconfdir=/etc
make
sudo checkinstall --requires unixodbc
sudo odbcinst -i -d -f samples/unixodbc.freetds.driver.template

This assumes you have checkinstall and the require development packages, if not 
you
will first need something like:

sudo apt-get build-dep tdsodbc
sudo apt-get install checkinstall

Original comment by jkh...@gmail.com on 8 Apr 2010 at 5:11

GoogleCodeExporter commented 8 years ago

My memory on this is already fading, but we did actually use the latest FreeTDS 
version. The workaround for us was to use an explicit convert() on those 
columns while 
doing an insert. Select's worked fine without any additional convert's.

Original comment by harid...@gmail.com on 19 Apr 2010 at 8:32

GoogleCodeExporter commented 8 years ago

Can you try this with the latest source from the v2unicode build?  I believe I 
have fixed the UCS4 errors and some other Unicode issues.

Original comment by mkleehammer on 5 Sep 2010 at 6:23

GoogleCodeExporter commented 8 years ago

I believe this is fixed in 2.1.8, so I'm going to close for now.  Please reopen 
if it is still not working.

Original comment by mkleehammer on 6 Sep 2010 at 6:14

Changed state: Fixed

GoogleCodeExporter commented 8 years ago

Original comment by mkleehammer on 21 Nov 2010 at 4:44

Changed state: Complete

tanrj / pyodbc

unicode object -> pyodbc -> unixodbc -> freetds fails #52