alexbrainman / odbc

odbc driver written in go
BSD 3-Clause "New" or "Revised" License

Is it possible to change the encoding? #91

Closed vorlif closed 7 years ago

vorlif commented 7 years ago

The Python pyodbc driver supports changing the encoding. Is something like that possible with your driver?

The reason is: we are currently using a very old database system (Pervasive SQL 8) at work. If I select strings containing umlauts from the database, the umlauts are replaced with � signs.

Example:

id  name
1   gebäude
2   other row
3   a other row

If I try the following:

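db, err := sql.Open("odbc", "DSN=testdb;")
if err != nil {
   log.Fatal(err)
}
defer db.Close()

// Queries with umlauts work
rows, err := db.Query("SELECT id, name FROM dummy WHERE name = ?", "gebäude")
if err != nil {
   log.Fatal(err)
}
defer rows.Close()

var (
   id   int
   name string
)

for rows.Next() {
   // Scan copies the current row into id and name
   if err := rows.Scan(&id, &name); err != nil {
      log.Fatal(err)
   }
   log.Println(id, name)
}

// Result:
// 1 geb�ude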

I think this is a problem with the encoding. Do you have any idea how to fix it?

Thanks in advance for your effort and time.

alexbrainman commented 7 years ago

If you add this fmt.Printf:

diff --git a/column.go b/column.go
index 2408b68..b9af78f 100644
--- a/column.go
+++ b/column.go
@@ -68,6 +68,7 @@ func NewColumn(h api.SQLHSTMT, idx int) (Column, error) {
    b := &BaseColumn{
        name: api.UTF16ToString(namebuf[:namelen]),
    }
+   fmt.Printf("ZZZ: name=%q sqltype=%v\n", b.name, sqltype)
    switch sqltype {
    case api.SQL_BIT:
        return NewBindableColumn(b, api.SQL_C_BIT, 1), nil

and run your program again, what does the fmt.Printf print? It would be easier to drop scanning into "id" from your "rows.Scan(&id, &name)" - we only care about strings here.

Can I reproduce your problem here? Do you run your Go program on Windows? What ODBC driver do you use? I don't have Pervasive SQL 8 here.

github.com/alexbrainman/odbc should work fine with unicode. But each database and ODBC driver has its own unicode issues. I know nothing about these. You say that python program works for you. Do you have any particular configurations?

Thank you.

Alex

vorlif commented 7 years ago

Hello Alex,

if you want, we can close the issue. I've worked around my problem with my own string type (which implements the Value/Scan interfaces) and the windows1252 decoder from the charmap package.

System: Windows 10
Driver: The curse of my life - Pervasive Client Driver
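For readers who want to try the same workaround, here is a minimal sketch of such a string type. It is not Florian's actual code: the names `Win1252String` and `decodeWindows1252` are made up for illustration, and the decoder is a small hand-written table so that the example needs only the standard library. In a real program the `charmap.Windows1252` decoder from golang.org/x/text/encoding/charmap would be the usual choice. Bytes that Windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D) are passed through as the matching C1 control code points, which is an assumption of this sketch. Only the Scan side is shown.

```go
package main

import (
	"database/sql"
	"fmt"
)

// win1252High maps the bytes 0x80-0x9F to Unicode code points.
// All other byte values in Windows-1252 equal their Unicode code point.
var win1252High = [32]rune{
	0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
	0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
	0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
	0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178,
}

// decodeWindows1252 converts Windows-1252 bytes to a UTF-8 Go string.
func decodeWindows1252(b []byte) string {
	runes := make([]rune, len(b))
	for i, c := range b {
		if c >= 0x80 && c <= 0x9F {
			runes[i] = win1252High[c-0x80]
		} else {
			runes[i] = rune(c)
		}
	}
	return string(runes)
}

// Win1252String (hypothetical name) decodes Windows-1252 column data on Scan.
type Win1252String string

// Compile-time check that the type satisfies database/sql.Scanner.
var _ sql.Scanner = (*Win1252String)(nil)

// Scan accepts the raw column value and stores it as UTF-8.
func (s *Win1252String) Scan(src interface{}) error {
	switch v := src.(type) {
	case nil:
		*s = ""
	case []byte:
		*s = Win1252String(decodeWindows1252(v))
	case string:
		*s = Win1252String(decodeWindows1252([]byte(v)))
	default:
		return fmt.Errorf("Win1252String: unsupported source type %T", src)
	}
	return nil
}

func main() {
	var name Win1252String
	// The bytes a Windows-1252 driver would return for "gebäude".
	if err := name.Scan([]byte{'g', 'e', 'b', 0xE4, 'u', 'd', 'e'}); err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	fmt.Println(name)
}
```

With a type like this, `rows.Scan(&id, &name)` works unchanged once `name` is declared as `Win1252String` instead of `string`.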

fmt.Printf output:

ZZZ: name="name" sqltype=12

Here is the string öäü as bytes: [246 228 252]

Thank you for your effort

Florian

alexbrainman commented 7 years ago

ZZZ: name="name" sqltype=12

The sqltype=12 means that your ODBC driver reports SQL_VARCHAR as the type of this column. SQL_VARCHAR is for strings consisting of bytes - each character is one byte with a value from 0 to 255. Such a column can only store ASCII characters 0-127, plus another 128 characters whose meaning depends on the code page the data was written in. If you know that these are windows1252, then what you are doing is all you can do.

Normally Unicode characters are reported in ODBC with SQL_WCHAR and SQL_WVARCHAR. There are 2 possible reasons why you don't get that: 1) your database does not support Unicode or you use the wrong column types (compare char and nchar, varchar and nvarchar, text and ntext and others in MS SQL Server); 2) your data is stored as Unicode, but your ODBC driver does not support it or has been misconfigured.
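To make the type code easier to read, here is a small sketch that maps the character type codes to names and says whether each one is a byte type or a wide type. The numeric values are the ones defined in the standard ODBC headers; the Go constant names and the function `describeSQLType` are made up for this example and are not part of the driver's API.

```go
package main

import "fmt"

// Character type codes as defined in the ODBC headers (sql.h, sqlext.h, sqlucode.h).
const (
	sqlChar         = 1
	sqlVarchar      = 12
	sqlLongVarchar  = -1
	sqlWChar        = -8
	sqlWVarchar     = -9
	sqlWLongVarchar = -10
)

// describeSQLType names a character type code and reports whether the
// driver delivers it as single bytes or as wide (UTF-16) characters.
func describeSQLType(t int) string {
	switch t {
	case sqlChar:
		return "SQL_CHAR (bytes)"
	case sqlVarchar:
		return "SQL_VARCHAR (bytes)"
	case sqlLongVarchar:
		return "SQL_LONGVARCHAR (bytes)"
	case sqlWChar:
		return "SQL_WCHAR (wide)"
	case sqlWVarchar:
		return "SQL_WVARCHAR (wide)"
	case sqlWLongVarchar:
		return "SQL_WLONGVARCHAR (wide)"
	default:
		return fmt.Sprintf("not a character type known to this sketch (%d)", t)
	}
}

func main() {
	fmt.Println(describeSQLType(12)) // SQL_VARCHAR (bytes)
	fmt.Println(describeSQLType(-9)) // SQL_WVARCHAR (wide)
}
```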
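For comparison, this is roughly what happens with a wide column: the data arrives as UTF-16 code units and is converted to a Go string without any guessing about code pages. The diff earlier in this thread shows the driver calling api.UTF16ToString for the column name; the sketch below uses the standard library's unicode/utf16 instead, and the helper name `decodeWide` is made up.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// decodeWide converts UTF-16 code units to a UTF-8 Go string.
func decodeWide(buf []uint16) string {
	return string(utf16.Decode(buf))
}

func main() {
	// "gebäude" as UTF-16 code units; ä is the single unit 0x00E4.
	wide := []uint16{'g', 'e', 'b', 0x00E4, 'u', 'd', 'e'}
	fmt.Println(decodeWide(wide)) // gebäude
}
```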

I do not know if your problem is 1 or 2, but I hope my explanation might give you some avenues to explore.

Please close the issue, if there is nothing to be done here.

Thank you

Alex

vorlif commented 7 years ago

Hello Alex,

in this case, I think my database doesn't support Unicode. I'll close the issue.

Thank you

Florian