alexbrainman / odbc

odbc driver written in go
BSD 3-Clause "New" or "Revised" License
352 stars 140 forks

"panic: runtime error: slice bounds out of range" when using utf-8 with informix odbc #98

Closed. dani-91 closed this 6 years ago.

dani-91 commented 6 years ago

Hi,

here is my configuration: Windows 10 with informix driver client sdk 4.10 64bit (also tested with 32bit).

I used CLIENT_LOCALE = en_US.8859-1 in ODBC.INI and everything worked fine, but I want to use CLIENT_LOCALE = en_US.utf8. With utf8 I get the following error whenever the data contains German umlauts like Ä, Ö, ü, ß etc.:

```
panic: runtime error: slice bounds out of range

goroutine 1 [running]:
github.com/alexbrainman/odbc.(*BindableColumn).Value(0xc04204a200, 0x9de930, 0x1, 0x4e2120, 0xc04204e1c8, 0x0, 0x0)
	C:/Users/dani-91/go/src/github.com/alexbrainman/odbc/column.go:254 +0x54f
github.com/alexbrainman/odbc.(*Rows).Next(0xc04206c020, 0xc042048680, 0x2, 0x2, 0x2, 0x2)
	C:/Users/dani-91/go/src/github.com/alexbrainman/odbc/rows.go:35 +0xc6
database/sql.(*Rows).nextLocked(0xc042090000, 0xc042069d80)
	c:/go/src/database/sql/sql.go:2457 +0x77
database/sql.(*Rows).Next.func1()
	c:/go/src/database/sql/sql.go:2442 +0x43
database/sql.withLock(0x58a940, 0xc042090030, 0xc042069dc8)
	c:/go/src/database/sql/sql.go:2855 +0x6c
database/sql.(*Rows).Next(0xc042090000, 0x514068)
	c:/go/src/database/sql/sql.go:2441 +0x81
main.main()
	C:/Users/dani-91/go/src/hello/main.go:36 +0x3d9
```

I debugged a bit and found that the error occurs in column.go, at the return statement in `func (c *BindableColumn) Value(h api.SQLHSTMT, idx int)`. The statement is `c.BaseColumn.Value(c.Buffer[:c.Len])`, and the slice expression `c.Buffer[:c.Len]` fails because the buffer is smaller than `c.Len`. If I change the line to `c.BaseColumn.Value(c.Buffer)`, it no longer panics, but then some characters are cut off at the end of the string.

EDIT: I have now also tested with unixODBC on Debian. Exactly the same thing happens.

Can you fix this?

alexbrainman commented 6 years ago

I used CLIENT_LOCALE = en_US.8859-1 in ODBC.INI and everything worked fine. But I want to use CLIENT_LOCALE = en_US.utf8. With utf8

Why did you do that?

C:/Users/dani-91/go/src/github.com/alexbrainman/odbc/column.go:254

The column.go:254 line https://github.com/alexbrainman/odbc/blob/master/column.go#L254 cannot panic with that message. Are you using latest version? Did you change the code?

It's the statement c.BaseColumn.Value(c.Buffer[:c.Len])

Can you tell me what the values of len(c.Buffer) and c.Len are? Maybe you could also try and understand how c.Buffer and c.Len got their values. What made them what they are?

Can you also show your program? Please make your program as small as possible - just leave only code that demonstrates the problem.

Thank you.

dani-91 commented 6 years ago

I found the problem. With ISO 8859-1 every character has the same length. With UTF-8 the encoded length can be larger, for example for umlauts like Ä or ö. The buffer always has the length of the table column + 1 (I think because of the terminating character), assuming one character = one byte. If an umlaut is used, the buffer is smaller than the text because an umlaut takes 2 bytes.

Yes, I used the newest version. For example, the column length is 50 and the buffer length is 51, but UTF-8 needs more. Do you need the code, or is that enough?

Thank you :)

alexbrainman commented 6 years ago

With iso8859-1 every character has the same length. With UTF-8, the length can be bigger

This driver does not support utf8 chars. If you look at the code, only SQL_CHAR and SQL_WCHAR are supported. I am surprised your program did not crash earlier. ODBC should be reporting the correct column types. Does your driver report SQL_CHAR or SQL_WCHAR and then send utf8? Can you try asking for utf16 and see what happens?

Alex

dani-91 commented 6 years ago

This driver does not support utf8 chars.

Ok, I did not know that. I thought it was best to use UTF-8 because Go also uses UTF-8 internally for strings. I tried to set the locale to UTF-16 (en_US.utf16), but that didn't work: {HY000} [Informix][Informix ODBC Driver][Informix]Unable to load locale categories.

Which locale do you recommend to use?

alexbrainman commented 6 years ago

I thought it is the best to use utf-8 because golang also uses utf-8 internal for variables.

Go does use utf8 everywhere, but ODBC uses ASCII or UTF16, and this driver translates either into utf8. But ODBC has to report the correct column data type. What data type does ODBC report for that column? Also show me the SQL statement: what is the SQL column type?

I tried to set the locale to utf-16 (en_US.utf16) but that didnt't work: {HY000} [Informix][Informix ODBC Driver][Informix]Unable to load locale categories.

This error message does not help.

Which locale do you recommend to use?

I have never used Informix or its drivers. I would not know.

Alex

dani-91 commented 6 years ago

Here is the SQL statement: `rows, err := db.Query("select auftr_nr, text from texte where auftr_nr = ?;", 10209737)`. The column type of text is char(50).

Column type in rows.ColumnTypes() is 0xc04207e0f0

alexbrainman commented 6 years ago

Column type in rows.ColumnTypes() is 0xc04207e0f0

This is not it. Please print the sqltype variable value that is set in NewColumn. Thank you.

Alex

dani-91 commented 6 years ago

Sqltype is 1 (api.SQL_VARCHAR)

I tried to use api.SQL_WVARCHAR in case statement and now it converts properly

alexbrainman commented 6 years ago

Sqltype is 1 (api.SQL_VARCHAR)

1 is SQL_CHAR, not SQL_VARCHAR. SQL_VARCHAR is 12.

I tried to use api.SQL_WVARCHAR in case statement and now it converts properly

My code already handles api.SQL_WVARCHAR columns. Why didn't it work before? What did you change?

Alex

dani-91 commented 6 years ago

1 is SQL_CHAR, not SQL_VARCHAR. SQL_VARCHAR is 12.

Oh yes, you are right.

My code already handles api.SQL_WVARCHAR columns. Why didn't it work before? What did you change?

I made this change and now it works:

```go
switch sqltype {
case api.SQL_BIT:
	return NewBindableColumn(b, api.SQL_C_BIT, 1), nil
case api.SQL_TINYINT, api.SQL_SMALLINT, api.SQL_INTEGER:
	return NewBindableColumn(b, api.SQL_C_LONG, 4), nil
case api.SQL_BIGINT:
	return NewBindableColumn(b, api.SQL_C_SBIGINT, 8), nil
case api.SQL_NUMERIC, api.SQL_DECIMAL, api.SQL_FLOAT, api.SQL_REAL, api.SQL_DOUBLE:
	return NewBindableColumn(b, api.SQL_C_DOUBLE, 8), nil
case api.SQL_TYPE_TIMESTAMP:
	var v api.SQL_TIMESTAMP_STRUCT
	return NewBindableColumn(b, api.SQL_C_TYPE_TIMESTAMP, int(unsafe.Sizeof(v))), nil
case api.SQL_TYPE_DATE:
	var v api.SQL_DATE_STRUCT
	return NewBindableColumn(b, api.SQL_C_DATE, int(unsafe.Sizeof(v))), nil
case api.SQL_TYPE_TIME:
	var v api.SQL_TIME_STRUCT
	return NewBindableColumn(b, api.SQL_C_TIME, int(unsafe.Sizeof(v))), nil
case api.SQL_GUID:
	var v api.SQLGUID
	return NewBindableColumn(b, api.SQL_C_GUID, int(unsafe.Sizeof(v))), nil
// case api.SQL_CHAR, api.SQL_VARCHAR:
//	return NewVariableWidthColumn(b, api.SQL_C_CHAR, size), nil
case api.SQL_WCHAR, api.SQL_WVARCHAR, api.SQL_CHAR, api.SQL_VARCHAR: // was: api.SQL_WCHAR, api.SQL_WVARCHAR
	return NewVariableWidthColumn(b, api.SQL_C_WCHAR, size), nil
case api.SQL_BINARY, api.SQL_VARBINARY:
	return NewVariableWidthColumn(b, api.SQL_C_BINARY, size), nil
case api.SQL_LONGVARCHAR:
	return NewVariableWidthColumn(b, api.SQL_C_CHAR, 0), nil
case api.SQL_WLONGVARCHAR, api.SQL_SS_XML:
	return NewVariableWidthColumn(b, api.SQL_C_WCHAR, 0), nil
case api.SQL_LONGVARBINARY:
	return NewVariableWidthColumn(b, api.SQL_C_BINARY, 0), nil
default:
	return nil, fmt.Errorf("unsupported column type %d", sqltype)
}
```

alexbrainman commented 6 years ago

case api.SQL_WCHAR, api.SQL_WVARCHAR, api.SQL_CHAR, api.SQL_VARCHAR: return NewVariableWidthColumn(b, api.SQL_C_WCHAR, size), nil

So you are treating every character column as SQL_WCHAR. That is wrong in general, but fine if it works for you. The problem appears to be in your ODBC driver's SQLDescribeCol function, which reports SQL_CHAR for SQL_WCHAR columns. Perhaps you have some config files or settings that make your driver behave this way. I do not know; I am leaving this up to you to investigate and decide.

Alex

dani-91 commented 6 years ago

Thank you! That was it. I found this setting for Informix in odbc.ini: `REPORTCHARCOLASWIDECHARCOL=1`. Now your driver works without modification. That's great.

Have a nice weekend. Daniel
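For anyone hitting the same problem: a DSN entry using this setting might look like the sketch below. Only CLIENT_LOCALE and REPORTCHARCOLASWIDECHARCOL come from this thread; the DSN name, driver path, and server details are placeholders that depend on your Informix CSDK installation.

```ini
[informix_utf8]
; Placeholder driver path and connection details.
Driver=/opt/informix/lib/cli/iclit09b.so
Server=my_informix_server
Database=my_database
CLIENT_LOCALE=en_US.utf8
; Report CHAR/VARCHAR columns as wide (SQL_WCHAR) columns, so they are
; bound as UTF-16 and converted to UTF-8 correctly by the Go driver.
REPORTCHARCOLASWIDECHARCOL=1
```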

alexbrainman commented 6 years ago

Thank you! That was it. I found this setting for Informix in odbc.ini: `REPORTCHARCOLASWIDECHARCOL=1`. Now your driver works without modification. That's great.

I am happy for you. Closing this issue.

Alex

joshuasprow commented 3 years ago

I just ran into a related error and this issue was always at the top of my Googles. It's probably an edge case, but it cost me a couple days, so I thought I'd share.

I'm working on an ETL pulling from a Postgres 9.3.6 database that's running on a Windows 2016 server. While I was doing some test queries, I kept getting a panic at column.go:254. Eventually I added a deferred recover function at the top of the Value method so I could capture the value of c. Each panic showed that c.Len was larger than the length of c.Buffer, and the column width was sometimes over 1000.

It turns out Postgres allows VARCHAR columns without a specified length to hold up to 1 GB. Cool, huh? I haven't dug through the package code enough, but I'm guessing it assumes a 255-character max (naturally). If so, heads-up!

I'd offer to do a pull request, but it looks like the panic is happening inside rows.Next(). Correct me if I'm wrong, but people don't typically expect that to return an error. The best I can think of is to add a recover so the user can be notified of the offending column. Let me know if you'd like to look at a PR.

PS: I solved the problem by setting the Max Varchar value in odbc.ini to a value that was large enough for those nasty columns. Hopefully this saves someone some time.
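For context, a psqlODBC DSN entry with that workaround might look like the sketch below. The odbc.ini key name for the "Max Varchar" setting is assumed here to be MaxVarcharSize; check your psqlODBC version's documentation, and treat all other values as placeholders.

```ini
[PostgreSQL35W]
; Placeholder driver name and connection details.
Driver=PostgreSQL Unicode
Servername=localhost
Database=my_database
; Bind unbounded VARCHAR columns with a buffer large enough for the data.
MaxVarcharSize=65535
```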

alexbrainman commented 3 years ago

@joshuasprow why don't you use a native Go Postgres driver, like everyone else?

I wrote this driver a long time ago to access my MS SQL Server. I would not be surprised if it does not work very well for servers other than MS SQL Server. This driver can be adjusted given enough effort, but why bother?

Alex

joshuasprow commented 3 years ago

@alexbrainman Thanks for the package! I know I'm not exactly using it for its intended purpose, but I don't think I'd be getting the job done without it.

I definitely tried using pgx and pq first, but the environment I'm working in requires I connect with ODBC. Unless you can suggest some magic that'll let me connect with DSN=PostgreSQL35W, this is what I've got. I would not recommend that anyone else do this.

The bigger problem is that the rest of the codebase is in Go and working great. I should probably rewrite it in C# or something, but that'd be a whole other can of worms.

Josh

alexbrainman commented 3 years ago

I definitely tried using pgx and pq first, but the environment I'm working in requires I connect with ODBC. Unless you can suggest some magic that'll let me connect with DSN=PostgreSQL35W, this is what I've got.

Thanks for explaining. If you have to use ODBC, then this package is, probably, the best choice.

I will happily accept changes if they don't affect other users (this package has a few tests, and I expect the tests to pass after your changes). Unfortunately I don't have much free time these days, so I can only give so much attention to this issue.

Also, please, open new issue describing your problem, if you decide to proceed.

Thank you.

Alex