Closed GoogleCodeExporter closed 9 years ago
I've just set up an environment using what I have at hand:
Ubuntu 8.10
freetds's tdsodbc 0.82-3ubuntu1 (as shipped with the distribution)
unixodbc 2.2.11-16build2 (as shipped with the distribution)
pyodbc 2.4.1
Your unixodbc and FreeTDS configuration files
SQL Server 2000
(btw it seems you are mixing some settings from odbc.ini in odbcinst.ini)
and tested saving an example model containing 'áéíóúñÁÉÍÓÚÑÇç',
which worked without problems. This confirms what I suspected:
0.63 is too old a version of FreeTDS to be useful (it was released almost four
years ago), and I wouldn't be surprised if its support for UTF-8 as a
client-side encoding was rather immature then.
For instance, compare the wording used in the 0.63 User Guide (nonwestern.htm file):
"
Important FreeTDS is not fully compatible with multi-byte character sets such
as UCS-2. You must use an ASCII-extension charset (e.g., UTF-8,
ISO-8859-*)[1]. Extreme care should be taken with testing
applications using these encodings. Specifically, many applications
do not expect the number of characters returned to exceed the column
size (in bytes). On the other hand, support of UTF-8 and UCS-2 is a
high priority for the developers. Patches and bug reports in this
area are especially welcome.
"
with the same paragraph in the equivalent 0.82 document (localization.htm):
"
Important FreeTDS is not fully compatible with multi-byte character sets such
as UCS-2. You must use an ASCII-extension charset (e.g., UTF-8,
ISO-8859-*)[2]. Great care should be taken testing applications using
these encodings. Specifically, many applications do not expect the
number of characters returned to exceed the column size (in bytes).
"
So you might want to try compiling and testing a newer FreeTDS version.
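The column-size caveat in both quoted paragraphs can be illustrated with a small sketch. It is written for Python 3 (an assumption; the thread itself predates it), and the helper name utf8_overflow is hypothetical, not part of FreeTDS, pyodbc, or django-pyodbc. It only shows that the test string used above needs more bytes in UTF-8 than it has characters, which is what can surprise an application that sizes buffers by the column size in bytes.

```python
import unicodedata

# Test string from the comment above, normalized so every letter is a
# single precomposed code point.
SAMPLE = unicodedata.normalize("NFC", "áéíóúñÁÉÍÓÚÑÇç")


def utf8_overflow(text, column_bytes):
    """Hypothetical helper: bytes by which the UTF-8 form exceeds a column.

    Returns 0 when the encoded text fits in column_bytes.
    """
    encoded = text.encode("utf-8")
    return max(0, len(encoded) - column_bytes)


char_count = len(SAMPLE)                    # number of characters
latin1_bytes = len(SAMPLE.encode("latin-1"))  # one byte per character
utf8_bytes = len(SAMPLE.encode("utf-8"))      # two bytes per character here

print(char_count, latin1_bytes, utf8_bytes)
print(utf8_overflow(SAMPLE, column_bytes=char_count))
```

A column declared with room for exactly as many bytes as the string has characters holds it in a single-byte charset, but the UTF-8 form handed to the client is twice as long for this particular sample.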
Additionally, FYI: I've found that even 0.82 isn't bug-free enough, and since
FreeTDS trunk is currently rather unstable, I use the "official patched 0.82
version" from http://freetds.sourceforge.net/. I'm using the Django test suite
as a way to measure this, and the difference in the number of failures between
0.82 and the "official patched 0.82 version" is enormous: in the first case
many of them are caused by FreeTDS errors, while in the latter case the
remaining test suite failures are all attributable to django-pyodbc itself.
This has even allowed me to do all my django-pyodbc development on Linux
and only periodically validate things on win32 (using MS ODBC drivers).
I will leave the ticket open; please report back your experience and
conclusions if possible.
Original comment by cra...@gmail.com
on 2 Feb 2009 at 8:48
The pyodbc documentation says that Unicode handling for the MSSQL TDS ODBC
driver is problematic because Python stores Unicode strings in UCS-4 and the
driver returns them in UCS-2; the pyodbc layer does not translate UCS-2<=>UCS-4.
The pyodbc docs suggest compiling Python with UCS-2 flags; when I tried that, I
got other error messages.
WORKAROUND FOR READ-ONLY MODELS: My application uses data from MSSQL in a
read-only fashion, so I managed to create some views using the PostgreSQL
module dblink_tds.
This module works as expected with Unicode, but joins using dblink_tds views are
somewhat inefficient (each view retrieves all rows from MSSQL before joining).
For reasonably sized databases with shallowly nested data models this may be
acceptable.
Original comment by paulo.sc...@gmail.com
on 4 Feb 2009 at 8:37
In which version of the pyodbc documentation did you read that? (BTW, what
version of pyodbc are you using?)
I can't find anything like that in the current (2.1.4) pyodbc source code or
documentation.
And I think that's actually wrong. The TDS protocol uses UCS-2 over the wire,
but FreeTDS converts it to/from the client charset you specify in freetds.conf
by using the iconv library, so client applications don't deal with UCS-2 data
at all.
A long-term aim of FreeTDS is to be able to offer a Unicode interface to the
client apps, and that (hopefully with similar advances on the pyodbc and
django-pyodbc fronts) could be a good match for the Django Unicode support when
talking with DB backends (i.e. no encoding/decoding would be needed in
django-pyodbc).
Meanwhile, we need FreeTDS to talk to us using UTF-8, so the "client charset =
UTF-8" freetds.conf setting is needed, and we hardcode the UTF-8 encoding of
Unicode data handed to us by Django and the UTF-8 decoding of data we get from
the DB.
I will close this ticket a week from now.
Original comment by cra...@gmail.com
on 5 Feb 2009 at 8:36
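The encoding chain described in the comment above can be sketched roughly as follows. This is a minimal illustration for Python 3, not the actual django-pyodbc code: the names CLIENT_CHARSET, to_driver, and from_driver are hypothetical, and the UTF-16-LE step only stands in for the UCS-2 form used on the wire (the two are identical for characters in the Basic Multilingual Plane). The real conversion between wire and client charset is done by FreeTDS through iconv.

```python
# Assumed to mirror the "client charset = UTF-8" setting in freetds.conf.
CLIENT_CHARSET = "utf-8"


def to_driver(value):
    """Hypothetical helper: encode Unicode text before handing it to the driver."""
    return value.encode(CLIENT_CHARSET)


def from_driver(raw):
    """Hypothetical helper: decode bytes received from the driver."""
    return raw.decode(CLIENT_CHARSET)


text = "áéíóúñÁÉÍÓÚÑÇç"

# What the application side sees: UTF-8 bytes in, UTF-8 bytes out.
client_bytes = to_driver(text)

# Stand-in for the wire representation: two bytes per BMP character.
wire_bytes = text.encode("utf-16-le")

# Stand-in for the iconv step: wire form converted to the client charset.
converted = wire_bytes.decode("utf-16-le").encode(CLIENT_CHARSET)

print(from_driver(converted) == text)
```

If the client charset configured in freetds.conf and the charset hardcoded in the backend disagree, the decode step is where mangled or rejected characters would be expected to show up.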
Original comment by cra...@gmail.com
on 14 Feb 2009 at 2:23
I have hit the same problem, but with only some characters.
For example, the following causes a system crash: Mečová (specifically č),
but when I
try áéíóúñÁÉÍÓÚÑÇç it works just fine.
I have 0.82 of FreeTDS, and I can replicate on both a Mac (installed through
MacPorts) and an Ubuntu (standard 8.10 package) system.
When I try it on Windows, it works perfectly.
Any suggestions are very welcome, as I'm really trying to avoid using Windows
for web
servers in a live deployment.
Original comment by matt.j.s...@gmail.com
on 12 Mar 2009 at 6:06
Is there a workaround for this?
Original comment by djmar...@gmail.com
on 13 Dec 2010 at 7:25
We have tested with more recent builds of FreeTDS and unixODBC (latest with the
10.4 Ubuntu packages) and the problem seems to have disappeared. Our problems
must have been with the underlying drivers.
Original comment by matt.j.s...@gmail.com
on 13 Dec 2010 at 7:37
Does 'Mečová' work for you now? On my Ubuntu 10.04 with MSSQL 2005 it still
doesn't.
Original comment by djmar...@gmail.com
on 13 Dec 2010 at 7:46
Would it be possible to drop the characters that are causing problems before
they hit the driver as a workaround?
Original comment by djmar...@gmail.com
on 14 Dec 2010 at 8:46
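One way the filtering suggested above could look is sketched below, for Python 3. It rests on an assumption the thread does not confirm: that the failing characters are the ones that cannot be represented in the server-side single-byte code page, taken here to be cp1252 (which contains á, é, ñ and ç but not č). The function names are hypothetical, and nothing here is part of django-pyodbc.

```python
import unicodedata


def drop_unencodable(text, charset="cp1252"):
    """Remove every character that the given charset cannot represent."""
    text = unicodedata.normalize("NFC", text)
    return text.encode(charset, "ignore").decode(charset)


def transliterate_unencodable(text, charset="cp1252"):
    """Keep representable characters; reduce the others to their base letter.

    Characters with no representable base letter are dropped.
    """
    result = []
    for ch in unicodedata.normalize("NFC", text):
        try:
            ch.encode(charset)
            result.append(ch)
        except UnicodeEncodeError:
            # Split into base letter plus combining marks, keep what fits.
            decomposed = unicodedata.normalize("NFKD", ch)
            result.append(decomposed.encode(charset, "ignore").decode(charset))
    return "".join(result)


print(drop_unencodable("Mečová"))
print(transliterate_unencodable("Mečová"))
```

Both variants silently change the stored data, so they only make sense where losing or approximating a character is preferable to a failed save.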
Original issue reported on code.google.com by
paulo.sc...@gmail.com
on 28 Jan 2009 at 11:39