steino / odbc

Automatically exported from code.google.com/p/odbc
BSD 3-Clause "New" or "Revised" License

'invalid UTF-8 in string' with mssql using latin1 #26

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I tried setting 'clientcharset' to 'latin1' or 'iso-8859-1', but I still get 
'invalid UTF-8 in string'.

Is your module able to decode the data on the fly like 
github.com/go-sql-driver/mysql does?

What version of the product are you using? On what operating system?
go version go1.1.2 windows/amd64
mssql 2005

Original issue reported on code.google.com by bigras.b...@gmail.com on 13 Nov 2013 at 8:49

GoogleCodeExporter commented 9 years ago
I don't understand what your problem is. Please provide some instructions so I 
can try to reproduce your problem here. Marking this issue as Invalid until you 
do.

> Is your module able to decode the data on the fly like 
> github.com/go-sql-driver/mysql does?

I don't understand your question. You should be able to use standard UTF-8 Go 
strings, and these will be saved into the corresponding MS SQL columns properly. 
If you use non-ASCII characters, the database columns must be of an appropriate 
type (nchar, not char, and so on).

Alex

Original comment by alex.bra...@gmail.com on 14 Nov 2013 at 4:06

GoogleCodeExporter commented 9 years ago
The MSSQL database I'm using already has data in latin1. I can't change that.

The column type is varchar(MAX) for the query I'm trying to use. I can't change 
that either.

In Python I would only need to connect with the following line, and I wouldn't 
need to decode from latin1 (_mssql would do it for me):
conn = _mssql.connect(server='ip', user='user', password='password', 
database=bd, charset='ISO-8859-1')

In Go, for MySQL with go-mysql, I guess it would be (I didn't test it):
"user:pass@tcp(ip:port)/database?charset=latin1"

I hope my example is clear enough. Basically, the data is encoded in latin1 in 
a varchar(max) column and I wish I could tell code.google.com/p/odbc that I 
want the data in UTF-8 when I do Scan().

I don't know if it matters but the database collation is 'French_CI_AS'.

Original comment by bigras.b...@gmail.com on 14 Nov 2013 at 3:46

GoogleCodeExporter commented 9 years ago
> The MSSQL database I'm using already has data in latin1. ...

If your data is stored in char fields and encoded in latin1, then 
code.google.com/p/odbc won't help you convert it to UTF-8. The driver only 
converts ANSI in char columns and UTF-16 in nchar columns into Go UTF-8 strings. 
You need to use some other package to convert latin1 to UTF-8. Sorry.

Alex

Original comment by alex.bra...@gmail.com on 15 Nov 2013 at 5:51
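
For readers who land here with the same problem, the conversion suggested above can be sketched with the Go standard library alone. This is a minimal sketch, not part of code.google.com/p/odbc: latin1ToUTF8 and Latin1String are hypothetical names introduced for illustration, and the code assumes the column bytes are strictly ISO-8859-1 and reach Scan() undecoded (for example as []byte). Whether the driver hands them over that way is an assumption that has not been checked here. If the data is really Windows-1252, which differs from ISO-8859-1 in the 0x80-0x9F range, a dedicated decoder such as golang.org/x/text/encoding/charmap would be a better fit.

```go
package main

import "fmt"

// latin1ToUTF8 converts ISO-8859-1 bytes to a UTF-8 Go string.
// Each ISO-8859-1 byte value equals the Unicode code point of the
// character it encodes, so every byte can be widened to a rune directly.
// (Hypothetical helper, not part of any driver.)
func latin1ToUTF8(b []byte) string {
	runes := make([]rune, len(b))
	for i, c := range b {
		runes[i] = rune(c)
	}
	return string(runes)
}

// Latin1String is a hypothetical type implementing the database/sql
// Scanner interface. It decodes ISO-8859-1 column data while scanning.
type Latin1String string

// Scan accepts raw bytes (or a string holding raw bytes) and stores the
// UTF-8 result. NULL becomes the empty string.
func (s *Latin1String) Scan(src interface{}) error {
	switch v := src.(type) {
	case nil:
		*s = ""
	case []byte:
		*s = Latin1String(latin1ToUTF8(v))
	case string:
		*s = Latin1String(latin1ToUTF8([]byte(v)))
	default:
		return fmt.Errorf("Latin1String: unsupported source type %T", src)
	}
	return nil
}

func main() {
	raw := []byte{0x63, 0x61, 0x66, 0xE9} // "café" encoded as ISO-8859-1
	fmt.Println(latin1ToUTF8(raw))

	var s Latin1String
	if err := s.Scan(raw); err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	fmt.Println(string(s))
}
```

With a type like this, a caller would pass &s (a *Latin1String) to rows.Scan in place of a *string. If the driver rejects the raw varchar data before Scan sees it, selecting the column as CAST(col AS varbinary(max)) may be one way to get the undecoded bytes through. That is a suggestion to try, not something confirmed in this thread.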