rana / ora

An Oracle database driver in Go.
MIT License
272 stars 66 forks source link

Reading CLOB looses UTF-8 data #196

Closed markotikvic closed 7 years ago

markotikvic commented 7 years ago

Hi, After upgrading to ora.v4 I can't seem to get the driver to read UTF-8 encoded CLOB from the database. This used to work in v3 by setting column type to ora.S and asserting result set to io.Reader and then using ioutil.ReadAll method to read the CLOB as []byte. Edit: The problem also appears in v3 after one point. I believe that defaulting [CB]LOB to string/[]byte might have caused this. Example:

func fetchJSON(db *ora.Ses, project string) ([]byte, error) {
    stmt, err := db.Prep(`SELECT
        JSON_CLOB
        FROM TABLES_CONFIG
        WHERE PROJEKAT` + project, ora.S)
    defer stmt.Close()

    if err != nil {
        return nil, err
    }

    rset, err := stmt.Qry()
    if err != nil {
        return nil, err
    }

    bytes := make([]byte, 0)
    if rset.Next() {
        lob := rset.Row[0].(io.Reader)
        if lob != nil {
            bytes, err = ioutil.ReadAll(lob)
            if err != nil {
                // TODO: Find a fix for this.
                // Some weird streaming read/write LOB error.
                // Ignore for now.
                //return nil, err
            }
        } else {
            return nil, errors.New("json config is null")
        }
    }

    return bytes, nil //bytes is a set of UTF-8 characters
}

But in v4 it's not possible to assert CLOB to io.Reader if the column type is set as string. Therefore I have to assert as string, but as I said it doesn't read UTF-8 characters correctly. Example:

func fetchJSON(db *ora.Ses, project string) ([]byte, error) {
        stmt, err := db.Prep(`SELECT JSON_CLOB FROM TABLES_CONFIG WHERE PROJEKAT` + EqualQuotes(project), ora.S)
        defer stmt.Close()
        if err != nil {
                return nil, err
        }

        rset, err := stmt.Qry()
        if err != nil {
                return nil, err
        }

        var data string
        if rset.Next() {
                data = rset.Row[0].(string)
        }

        fmt.Println(data) //data is not UTF-8
        return []byte(data), nil
}

I'm casting it to []byte at the end because I'm using json.Unmarshal afterwards. What the heck am I doing wrong (or is there a bug hidden somewhere in here)? I mean, it shouldn't be any more complicated than setting the column type to ora.S and asserting it as string afterwards, right?

Thank you in advance.

tgulacsi commented 7 years ago

Strange.

You can try selecting with ora.L as a column type, which should behave as in v3 (Reader or Lob). In the driver, we do no charset translation, connect to the DB and communicate with it in AL32UTF8 charset. May these col be NCLOB?

If you can share the exact column spec and some test data which reproduces the problem, we can test it.

markotikvic commented 7 years ago

In details, this is what worked:

// InitTables loads all payloads in the payloads variable.
// Returns an error if it fails.
func InitTables(db *ora.Ses, project string) error {
        jsonbuf, _ := fetchJSON(db, EqualQuotes(project))
        mu.Lock()
        defer mu.Unlock()
        json.Unmarshal(jsonbuf, &payloads)
        if len(payloads) == 0 {
                return errors.New("tables config is corrupt")
        }
        return nil
}

// fetchJSON returns a byte slice of JSON configuration file from TABLES_CONFIG table.
// Returns an error if it fails.
func fetchJSON(db *ora.Ses, project string) ([]byte, error) {
        stmt, err := db.Prep(`SELECT
                JSON_CLOB
                FROM TABLES_CONFIG
                WHERE PROJEKAT` + project, ora.S)
        defer stmt.Close()

        if err != nil {
                return nil, err
        }

        rset, err := stmt.Qry()
        if err != nil {
                return nil, err
        }

        bytes := make([]byte, 0)
        if rset.Next() {
                lob := rset.Row[0].(io.Reader)
                if lob != nil {
                        bytes, err = ioutil.ReadAll(lob)
                        if err != nil {
                                // TODO: Find a fix for this.
                                // Some weird streaming read/write LOB error.
                                // Ignore for now.
                                //return nil, err
                        }
                } else {
                        return nil, errors.New("json config is null")
                }
        }

        return bytes, nil
}

Ora v3 version (from the changelog):

## v3.7.1 ##
  * Fix defDate NULL handling error resulting in reading garbage as Time.
  * Add Ses.SetAction to be able to set the session's Module and Action attributes.

Column data type is set to CLOB without any special atributes. Results example:

    "method": "GET /api/v1/contracts",
    "params": {},
    "lang": [
        {
            "language": "en",
            "fieldsLabels": {
                "ID": "ID",
                "active": "ACTIVE",
                "contractNum": "CONTRACT NUMBER",
                "dateOfCreation": "DATE OF CREATION",
                "expirationDate": "EXPIRATION DATE",
                "idOds": "ODS",
                "idSupplier": "SUPPLIER",
                "internal": "INTERNAL",
                "js": "J/S",
                "note": "NOTE",
                "ods": "ODS",
                "reserved": "RESERVED",
                "seller": "SELLER",
                "sellerEmail": "SELLER (E‑MAIL)",
                "sellerPhone": "SELLER (PHONE)",
                "supplier": "SUPPLIER",
                "user": "BUYER",
                "userName": "BUYER NAME",
                "userTitle": "BUYER TITLE"
            }
        },
        {
            "language": "rs",
            "fieldsLabels": {
                "ID": "ID",
                "active": "AKTIVAN",
                "contractNum": "BROJ UGOVORA",
                "dateOfCreation": "DATUM",
                "expirationDate": "DATUM VAŽENJA",
                "idOds": "ODS",
                "idSupplier": "SNABDEVAČ",
                "internal": "INTERNI",
                "js": "J/S",
                "note": "NAPOMENA",
                "ods": "ODS",
                "reserved": "REZERVNO",
                "seller": "PRODAVAC",
                "sellerEmail": "PRODAVAC (E‑MAIL)",
                "sellerPhone": "PRODAVAC (TELEFON)",
                "supplier": "SNABDEVAČ",
                "user": "ŠIFRA KUPCA",
                "userName": "IME KUPCA",
                "userTitle": "KUPAC"
            }
        }
    ]
}

Same thing with ora v4 version:

## v4.1.9 ##
  * Fix context deadline issue 192.
// InitTables loads all payloads in the payloads variable.
// Returns an error if it fails.
func InitTables(db *ora.Ses, project string) error {
        jsonbuf, err := fetchJSON(db, project)
        if err != nil {
                return err
        }

        mu.Lock()
        defer mu.Unlock()
        json.Unmarshal(jsonbuf, &payloads)
        if len(payloads) == 0 {
                return errors.New("tables config is corrupt")
        }
        return nil
}

// fetchJSON returns a byte slice of JSON configuration file from TABLES_CONFIG table.
// Returns an error if it fails.
func fetchJSON(db *ora.Ses, project string) ([]byte, error) {
        db.SetCfg(db.Cfg().SetClob(ora.S))
        stmt, err := db.Prep(`SELECT JSON_CLOB FROM TABLES_CONFIG WHERE PROJEKAT` + EqualQuotes(project), ora.S)
        defer stmt.Close()
        if err != nil {
                return nil, err
        }

        rset, err := stmt.Qry()
        if err != nil {
                return nil, err
        }

        var data string
        if rset.Next() {
                data = rset.Row[0].(string)
        }

        //fmt.Println(data)
        return []byte(data), nil
}

Results:

{
    "method": "GET /api/v1/contracts",
    "params": {},
    "lang": [
        {
            "language": "en",
            "fieldsLabels": {
                "ID": "ID",
                "active": "ACTIVE",
                "contractNum": "CONTRACT NUMBER",
                "dateOfCreation": "DATE OF CREATION",
                "expirationDate": "EXPIRATION DATE",
                "idOds": "ODS",
                "idSupplier": "SUPPLIER",
                "internal": "INTERNAL",
                "js": "J/S",
                "note": "NOTE",
                "ods": "ODS",
                "reserved": "RESERVED",
                "seller": "SELLER",
                "sellerEmail": "SELLER (E‑MAIL)",
                "sellerPhone": "SELLER (PHONE)",
                "supplier": "SUPPLIER",
                "user": "BUYER",
                "userName": "BUYER NAME",
                "userTitle": "BUYER TITLE"
            }
        },
        {
            "language": "rs",
            "fieldsLabels": {
                "ID": "ID",
                "active": "AKTIVAN",
                "contractNum": "BROJ UGOVORA",
                "dateOfCreation": "DATUM",
                "expirationDate": "DATUM VA�ENJA",
                "idOds": "ODS",
                "idSupplier": "SNABDEVA�",
                "internal": "INTERNI",
                "js": "J/S",
                "note": "NAPOMENA",
                "ods": "ODS",
                "reserved": "REZERVNO",
                "seller": "PRODAVAC",
                "sellerEmail": "PRODAVAC (E‑MAIL)",
                "sellerPhone": "PRODAVAC (TELEFON)",
                "supplier": "SNABDEVA�",
                "user": "�IFRA KUPCA",
                "userName": "IME KUPCA",
                "userTitle": "KUPAC"
            }
        }
    ]
}

I should mention that the same problem occurs even with the latest ora v3 (pulled from v3 branch). I don't think I'll have the time to test what happens with NCLOB column, but I'll probably get to it tomorrow. Thanks for responding.

markotikvic commented 7 years ago

OK. Changing the column type to NCLOB (no LOB parameters set) solved the problem. Thank you.

I don't know if the issue can be closed. This does seem like a workaround to be honest.

tgulacsi commented 7 years ago

ora.v3 uses AL32UTF8 explicitly since 9c766dfb47a882a559cce4b82914c220137dc616 (2015-05-19). Before that it used what it got from the env: NLS_LANG & Co.

What is your database's native language set? What is your environment's NLS_LANG?

Can you copy here the result of a SELECT TO_CHAR(SUBSTR(theclobcol, 1, 1000)), dump(TO_CHAR(SUBSTR(theclobcol, 1, 1000))) from the clobtable

?

markotikvic commented 7 years ago

I have this done before I try to connect to DB

err := os.Setenv("NLS_LANG","_.WE8ISO8859P1")

Output:

[
    {
        "tableType": "OrderForm",
        "lang": [
            {
                "language": "en",
                "fieldsLabels": {
                    "contractId": "CONTRACT NUMBER",
                    "contractNum": "CONTRACT NUMBER",
                    "deliveryNode": "DELIVERY NODE ID",
                    "deliveryNodeName": "DELIVERY NODE NAME",
                    "hourlyMax": "HOURLY MAX",
                    "hourlyMin": "HOURLY MIN",
                    "dailyMax": "DAILY MAX",
                    "dailyMin": "DAILY MIN",
                    "jan": "JANUARY",
                    "feb": "FEBRUARY",
                    "mar": "MARCH",
                    "apr": "APRIL",
                    "may": "MAY",
                    "jun": "JUN",
                    "jul": "JULY",
                    "aug": "AUGUST",
                    "sep": "SEPTEMBER",
                    "oct": "OCTOBER",
                    "nov": "NOVEMBER",
                    "dec": "DECEMBER",
                    "timeToReplace": "TIME TO REPLACE",
                    "replacementPossible": "REPLACEMENT POSSIBLE",
                    "deliveryNodePhone": "DELIVERY NODE PHONE",
                    "capacity": "CAPACITY"
                }
            },
            {
                "language": "rs",
                "fieldsLabels": {
                    "contractId": "BROJ UGOVORA",
                    "contractNum": "BROJ UGOVORA",
                    "deliveryNode": "ŠIFRA MESTA ISPORUKE",
Typ=1 Len=1000: 91,10,9,123,10,9,9,34,116,97,98,108,101,84,121,112,101,34,58,32,34,79,114,100,101,114,70,111,114,109,34,44,10,9,9,34,108,97,110,103,34,58,32,91,10,9,9,9,123,10,9,9,9,9,34,108,97,110,103,117,97,103,101,34,58,32,34,101,110,34,44,10,9,9,9,9,34,102,105,101,108,100,115,76,97,98,101,108,115,34,58,32,123,10,9,9,9,9,9,34,99,111,110,116,114,97,99,116,73,100,34,58,32,34,67,79,78,84,82,65,67,84,32,78,85,77,66,69,82,34,44,10,9,9,9,9,9,34,99,111,110,116,114,97,99,116,78,117,109,34,58,32,34,67,79,78,84,82,65,67,84,32,78,85,77,66,69,82,34,44,10,9,9,9,9,9,34,100,101,108,105,118,101,114,121,78,111,100,101,34,58,32,34,68,69,76,73,86,69,82,89,32,78,79,68,69,32,73,68,34,44,10,9,9,9,9,9,34,100,101,108,105,118,101,114,121,78,111,100,101,78,97,109,101,34,58,32,34,68,69,76,73,86,69,82,89,32,78,79,68,69,32,78,65,77,69,34,44,10,9,9,9,9,9,34,104,111,117,114,108,121,77,97,120,34,58,32,34,72,79,85,82,76,89,32,77,65,88,34,44,10,9,9,9,9,9,34,104,111,117,114,108,121,77,105,110,34,58,32,34,72,79,85,82,76,89,32,77,73,78,34,44,10,9,9,9,9,9,34,100,97,105,108,121,77,97,120,34,58,32,34,68,65,73,76,89,32,77,65,88,34,44,10,9,9,9,9,9,34,100,97,105,108,121,77,105,110,34,58,32,34,68,65,73,76,89,32,77,73,78,34,44,10,9,9,9,9,9,34,106,97,110,34,58,32,34,74,65,78,85,65,82,89,34,44,10,9,9,9,9,9,34,102,101,98,34,58,32,34,70,69,66,82,85,65,82,89,34,44,10,9,9,9,9,9,34,109,97,114,34,58,32,34,77,65,82,67,72,34,44,10,9,9,9,9,9,34,97,112,114,34,58,32,34,65,80,82,73,76,34,44,10,9,9,9,9,9,34,109,97,121,34,58,32,34,77,65,89,34,44,10,9,9,9,9,9,34,106,117,110,34,58,32,34,74,85,78,34,44,10,9,9,9,9,9,34,106,117,108,34,58,32,34,74,85,76,89,34,44,10,9,9,9,9,9,34,97,117,103,34,58,32,34,65,85,71,85,83,84,34,44,10,9,9,9,9,9,34,115,101,112,34,58,32,34,83,69,80,84,69,77,66,69,82,34,44,10,9,9,9,9,9,34,111,99,116,34,58,32,34,79,67,84,79,66,69,82,34,44,10,9,9,9,9,9,34,110,111,118,34,58,32,34,78,79,86,69,77,66,69,82,34,44,10,9,9,9,9,9,34,100,101,99,34,58,32,34,68,69,67,69,77,66,69,82,34,44,10,9,9,9,9,9,34,116,105,109,101,84,111,82,101,112,108,97,99,101,34,58,32,34,84,73,77,69,32,84,79,32,82,69,80,76,65,67,69,34,44,10,9,9,9,9,9,34,114,101,112,108,97,99,101,109,101,110,116,80,111,115,115,105,98,108,101,34,58,32,34,82,69,80,76,65,67,69,77,69,78,84,32,80,79,83,83,73,66,76,69,34,44,10,9,9,9,9,9,34,100,101,108,105,118,101,114,121,78,111,100,101,80,104,111,110,101,34,58,32,34,68,69,76,73,86,69,82,89,32,78,79,68,69,32,80,72,79,78,69,34,44,10,9,9,9,9,9,34,99,97,112,97,99,105,116,121,34,58,32,34,67,65,80,65,67,73,84,89,34,10,9,9,9,9,125,10,9,9,9,125,44,10,9,9,9,123,10,9,9,9,9,34,108,97,110,103,117,97,103,101,34,58,32,34,114,115,34,44,10,9,9,9,9,34,102,105,101,108,100,115,76,97,98,101,108,115,34,58,32,123,10,9,9,9,9,9,34,99,111,110,116,114,97,99,116,73,100,34,58,32,34,66,82,79,74,32,85,71,79,86,79,82,65,34,44,10,9,9,9,9,9,34,99,111,110,116,114,97,99,116,78,117,109,34,58,32,34,66,82,79,74,32,85,71,79,86,79,82,65,34,44,10,9,9,9,9,9,34,100,101,108,105,118,101,114,121,78,111,100,101,34,58,32,34,169,73,70,82,65,32,77,69,83,84,65,32,73,83,80,79,82,85,75,69,34,44,10,9,9,9,9,9
tgulacsi commented 7 years ago

What about

err := os.Setenv("NLS_LANG","_.AL32UTF8")

or leaving it out completely?

'cause this is what the library does if you don't force it - and setting NLS_LANG in the process will confuse everything.

Marko Tikvic notifications@github.com ezt írta (időpont: 2017. máj. 30., K, 11:23):

I have this done before I try to connect to DB

err := os.Setenv("NLSLANG",".WE8ISO8859P1")

Output:

[ { "tableType": "OrderForm", "lang": [ { "language": "en", "fieldsLabels": { "contractId": "CONTRACT NUMBER", "contractNum": "CONTRACT NUMBER", "deliveryNode": "DELIVERY NODE ID", "deliveryNodeName": "DELIVERY NODE NAME", "hourlyMax": "HOURLY MAX", "hourlyMin": "HOURLY MIN", "dailyMax": "DAILY MAX", "dailyMin": "DAILY MIN", "jan": "JANUARY", "feb": "FEBRUARY", "mar": "MARCH", "apr": "APRIL", "may": "MAY", "jun": "JUN", "jul": "JULY", "aug": "AUGUST", "sep": "SEPTEMBER", "oct": "OCTOBER", "nov": "NOVEMBER", "dec": "DECEMBER", "timeToReplace": "TIME TO REPLACE", "replacementPossible": "REPLACEMENT POSSIBLE", "deliveryNodePhone": "DELIVERY NODE PHONE", "capacity": "CAPACITY" } }, { "language": "rs", "fieldsLabels": { "contractId": "BROJ UGOVORA", "contractNum": "BROJ UGOVORA", "deliveryNode": "ŠIFRA MESTA ISPORUKE",

Typ=1 Len=1000: 91,10,9,123,10,9,9,34,116,97,98,108,101,84,121,112,101,34,58,32,34,79,114,100,101,114,70,111,114,109,34,44,10,9,9,34,108,97,110,103,34,58,32,91,10,9,9,9,123,10,9,9,9,9,34,108,97,110,103,117,97,103,101,34,58,32,34,101,110,34,44,10,9,9,9,9,34,102,105,101,108,100,115,76,97,98,101,108,115,34,58,32,123,10,9,9,9,9,9,34,99,111,110,116,114,97,99,116,73,100,34,58,32,34,67,79,78,84,82,65,67,84,32,78,85,77,66,69,82,34,44,10,9,9,9,9,9,34,99,111,110,116,114,97,99,116,78,117,109,34,58,32,34,67,79,78,84,82,65,67,84,32,78,85,77,66,69,82,34,44,10,9,9,9,9,9,34,100,101,108,105,118,101,114,121,78,111,100,101,34,58,32,34,68,69,76,73,86,69,82,89,32,78,79,68,69,32,73,68,34,44,10,9,9,9,9,9,34,100,101,108,105,118,101,114,121,78,111,100,101,78,97,109,101,34,58,32,34,68,69,76,73,86,69,82,89,32,78,79,68,69,32,78,65,77,69,34,44,10,9,9,9,9,9,34,104,111,117,114,108,121,77,97,120,34,58,32,34,72,79,85,82,76,89,32,77,65,88,34,44,10,9,9,9,9,9,34,104,111,117,114,108,121,77,105,110,34,58,32,34,72,79,85,82,76,89,32,77,73,78,34,44,10,9,9,9,9,9,34,100,97,105,108,121,77,97,120,34,58,32,34,68,65,73,76,89,32,77,65,88,34,44,10,9,9,9,9,9,34,100,97,105,108,121,77,105,110,34,58,32,34,68,65,73,76,89,32,77,73,78,34,44,10,9,9,9,9,9,34,106,97,110,34,58,32,34,74,65,78,85,65,82,89,34,44,10,9,9,9,9,9,34,102,101,98,34,58,32,34,70,69,66,82,85,65,82,89,34,44,10,9,9,9,9,9,34,109,97,114,34,58,32,34,77,65,82,67,72,34,44,10,9,9,9,9,9,34,97,112,114,34,58,32,34,65,80,82,73,76,34,44,10,9,9,9,9,9,34,109,97,121,34,58,32,34,77,65,89,34,44,10,9,9,9,9,9,34,106,117,110,34,58,32,34,74,85,78,34,44,10,9,9,9,9,9,34,106,117,108,34,58,32,34,74,85,76,89,34,44,10,9,9,9,9,9,34,97,117,103,34,58,32,34,65,85,71,85,83,84,34,44,10,9,9,9,9,9,34,115,101,112,34,58,32,34,83,69,80,84,69,77,66,69,82,34,44,10,9,9,9,9,9,34,111,99,116,34,58,32,34,79,67,84,79,66,69,82,34,44,10,9,9,9,9,9,34,110,111,118,34,58,32,34,78,79,86,69,77,66,69,82,34,44,10,9,9,9,9,9,34,100,101,99,34,58,32,34,68,69,67,69,77,66,69,82,34,44,10,9,9,9,9,9,34,116,105,109,101,84,111,82,101,112,108,97,99,101,34,58,32,34,84,73,77,69,32,84,79,32,82,69,80,76,65,67,69,34,44,10,9,9,9,9,9,34,114,101,112,108,97,99,101,109,101,110,116,80,111,115,115,105,98,108,101,34,58,32,34,82,69,80,76,65,67,69,77,69,78,84,32,80,79,83,83,73,66,76,69,34,44,10,9,9,9,9,9,34,100,101,108,105,118,101,114,121,78,111,100,101,80,104,111,110,101,34,58,32,34,68,69,76,73,86,69,82,89,32,78,79,68,69,32,80,72,79,78,69,34,44,10,9,9,9,9,9,34,99,97,112,97,99,105,116,121,34,58,32,34,67,65,80,65,67,73,84,89,34,10,9,9,9,9,125,10,9,9,9,125,44,10,9,9,9,123,10,9,9,9,9,34,108,97,110,103,117,97,103,101,34,58,32,34,114,115,34,44,10,9,9,9,9,34,102,105,101,108,100,115,76,97,98,101,108,115,34,58,32,123,10,9,9,9,9,9,34,99,111,110,116,114,97,99,116,73,100,34,58,32,34,66,82,79,74,32,85,71,79,86,79,82,65,34,44,10,9,9,9,9,9,34,99,111,110,116,114,97,99,116,78,117,109,34,58,32,34,66,82,79,74,32,85,71,79,86,79,82,65,34,44,10,9,9,9,9,9,34,100,101,108,105,118,101,114,121,78,111,100,101,34,58,32,34,169,73,70,82,65,32,77,69,83,84,65,32,73,83,80,79,82,85,75,69,34,44,10,9,9,9,9,9

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/rana/ora/issues/196#issuecomment-304822799, or mute the thread https://github.com/notifications/unsubscribe-auth/AAPoSslJpNaZ3BVPdkHm8hPaP4T5Fiu4ks5r--AOgaJpZM4NnU3K .

markotikvic commented 7 years ago

OK. I'll try that when I get the chance and I'll get back to you.