exasol / pyexasol

Exasol Python driver with low overhead, fast HTTP transport and compression
MIT License
72 stars 39 forks source link

decimal separator is wrong when Exasol Column is DOUBLE and NLS_NUMERIC_CHARACTERS = ',.' #99

Closed huelf closed 1 year ago

huelf commented 2 years ago

Hi,

when I execute a query that returns a DOUBLE, and I executed "ALTER SESSION SET NLS_NUMERIC_CHARACTERS = ',.'" before, the Double Value is still in '9.9'-Format. But it should be '9,9'.

Example:

import pyexasol
import os
C = pyexasol.connect(dsn=os.environ['EX_DBHOST'], 
                                   user=os.environ['EX_DBUSER'], 
                                   password=os.environ['EX_DBPASSWORD'])

sql = """select cast(0.5 as DECIMAL(2,2)) as DECIMALCOLUMN
, cast(0.6 as DOUBLE) as DOUBLECOLUMN
from DUAL"""

# without ALTER SESSION (OK)
df = C.export_to_pandas(sql)
print(df.head())

cursor = C.execute(sql)
#print(cursor.columns())
for row in cursor:
    print(row)

# with ALTER SESSION (wrong)
stmt = C.execute("ALTER SESSION SET NLS_NUMERIC_CHARACTERS = ',.'")
df = C.export_to_pandas(sql)
print(df.head())

cursor = C.execute(sql)
#print(cursor.columns())
for row in cursor:
    print(row)

Result:

# without ALTER SESSION (OK)
  DECIMALCOLUMN  DOUBLECOLUMN
0            0.5           0.6
('0.5', 0.6)

# with ALTER SESSION (wrong)
  DECIMALCOLUMN DOUBLECOLUMN
0           0,5          0,6
('0,5', 0.6)

So, export_to_pandas() does as expected, but execute() should return '0,6', not 0.6.

Versions:

Python 3.9.7 (tags/v3.9.7:1016ef3, Aug 30 2021, 20:19:38) [MSC v.1929 64 bit (AMD64)]
Name: pyexasol
Version: 0.24.0

Unfortunately I cannot install the newest version on my environment.

Best regards Hans

littleK0i commented 2 years ago

Interesting. Could you please add debug=True connection option and attach logs?

I suspect, this is how Exasol WebSocket protocol returns float values, regardless of NLS_NUMERIC_CHARACTERS setting.

huelf commented 2 years ago

Here's the Log:

2022-11-30 07:50:33.811 Connection attempt [10.10.30.196:8563]
2022-11-30 07:50:33.842 [WebSocket request #1]
{
  "command": "login",
  "protocolVersion": 3
}
2022-11-30 07:50:33.842 [WebSocket response #1]
{
  "status": "ok",
  "responseData": {
    "publicKeyPem": "<public key>",
    "publicKeyExponent": "010001",
    "publicKeyModulus": "<modulus>"
  }
}
2022-11-30 07:50:33.983 [WebSocket request #2]
{
  "username": "<username>",
  "password": "<password>",
  "driverName": "PyEXASOL 0.24.0",
  "clientName": "PyEXASOL",
  "clientVersion": "0.24.0",
  "clientOs": "Windows-10-10.0.17763-SP0",
  "clientOsUsername": "<username>",
  "clientRuntime": "Python 3.9.7",
  "useCompression": false,
  "attributes": {
    "currentSchema": "",
    "autocommit": true,
    "queryTimeout": 0
  }
}
2022-11-30 07:50:33.998 [WebSocket response #2]
{
  "status": "ok",
  "responseData": {
    "protocolVersion": 3,
    "timeZone": "EUROPE/BERLIN",
    "timeZoneBehavior": "INVALID SHIFT AMBIGUOUS ST",
    "sessionId": 1750902846820319232,
    "maxDataMessageSize": 67108864,
    "releaseVersion": "7.1.6",
    "databaseName": "exa_db1_dev",
    "productName": "EXASolution",
    "maxIdentifierLength": 128,
    "maxVarcharLength": 2000000,
    "identifierQuoteString": "\""
  }
}
2022-11-30 07:50:33.998 [WebSocket request #3]
{
  "command": "getAttributes"
}
2022-11-30 07:50:34.029 [WebSocket response #3]
{
  "status": "ok",
  "attributes": {
    "datetimeFormat": "YYYY-MM-DD HH24:MI:SS.FF6",
    "dateFormat": "YYYY-MM-DD",
    "numericCharacters": ".,",
    "dateLanguage": "ENG",
    "queryTimeout": 0,
    "timezone": "EUROPE/BERLIN",
    "timezoneBehavior": "INVALID SHIFT AMBIGUOUS ST",
    "snapshotTransactionsEnabled": true,
    "defaultLikeEscapeCharacter": "\\",
    "autocommit": true,
    "compressionEnabled": false,
    "currentSchema": "",
    "openTransaction": 0
  }
}
2022-11-30 07:50:35.811 [WebSocket request #4]
{
  "command": "execute",
  "sqlText": "EXPORT (\nselect cast(0.5 as DECIMAL(2,2)) as DECIMALCOLUMN\n, cast(0.6 as DOUBLE) as DOUBLECOLUMN\nfrom DUAL\n) INTO CSV\nAT 'https://192.168.100.75:43360' FILE '000.csv'\nWITH COLUMN NAMES"
}
2022-11-30 07:50:36.998 [WebSocket response #4]
{
  "status": "ok",
  "responseData": {
    "results": [
      {
        "resultType": "rowCount",
        "rowCount": 1
      }
    ],
    "numResults": 1
  }
}
2022-11-30 07:50:37.967 [WebSocket request #5]
{
  "command": "execute",
  "sqlText": "select cast(0.5 as DECIMAL(2,2)) as DECIMALCOLUMN\n, cast(0.6 as DOUBLE) as DOUBLECOLUMN\nfrom DUAL"
}
2022-11-30 07:50:37.983 [WebSocket response #5]
{
  "status": "ok",
  "responseData": {
    "results": [
      {
        "resultType": "resultSet",
        "resultSet": {
          "numColumns": 2,
          "numRows": 1,
          "numRowsInMessage": 1,
          "columns": [
            {
              "name": "DECIMALCOLUMN",
              "dataType": {
                "type": "DECIMAL",
                "precision": 2,
                "scale": 2
              }
            },
            {
              "name": "DOUBLECOLUMN",
              "dataType": {
                "type": "DOUBLE"
              }
            }
          ],
          "data": [
            [
              "0.5"
            ],
            [
              0.6
            ]
          ]
        }
      }
    ],
    "numResults": 1
  }
}
2022-11-30 07:50:37.983 [WebSocket request #6]
{
  "command": "execute",
  "sqlText": "ALTER SESSION SET NLS_NUMERIC_CHARACTERS = ',.'"
}
2022-11-30 07:50:37.998 [WebSocket response #6]
{
  "status": "ok",
  "attributes": {
    "numericCharacters": ",."
  },
  "responseData": {
    "results": [
      {
        "resultType": "rowCount",
        "rowCount": 0
      }
    ],
    "numResults": 1
  }
}
2022-11-30 07:50:38.498 [WebSocket request #7]
{
  "command": "execute",
  "sqlText": "EXPORT (\nselect cast(0.5 as DECIMAL(2,2)) as DECIMALCOLUMN\n, cast(0.6 as DOUBLE) as DOUBLECOLUMN\nfrom DUAL\n) INTO CSV\nAT 'https://192.168.100.75:45003' FILE '000.csv'\nWITH COLUMN NAMES"
}
2022-11-30 07:50:39.654 [WebSocket response #7]
{
  "status": "ok",
  "responseData": {
    "results": [
      {
        "resultType": "rowCount",
        "rowCount": 1
      }
    ],
    "numResults": 1
  }
}
2022-11-30 07:50:39.654 [WebSocket request #8]
{
  "command": "execute",
  "sqlText": "select cast(0.5 as DECIMAL(2,2)) as DECIMALCOLUMN\n, cast(0.6 as DOUBLE) as DOUBLECOLUMN\nfrom DUAL"
}
2022-11-30 07:50:39.670 [WebSocket response #8]
{
  "status": "ok",
  "responseData": {
    "results": [
      {
        "resultType": "resultSet",
        "resultSet": {
          "numColumns": 2,
          "numRows": 1,
          "numRowsInMessage": 1,
          "columns": [
            {
              "name": "DECIMALCOLUMN",
              "dataType": {
                "type": "DECIMAL",
                "precision": 2,
                "scale": 2
              }
            },
            {
              "name": "DOUBLECOLUMN",
              "dataType": {
                "type": "DOUBLE"
              }
            }
          ],
          "data": [
            [
              "0,5"
            ],
            [
              0.6
            ]
          ]
        }
      }
    ],
    "numResults": 1
  }
}
2022-11-30 07:50:39.717 [WebSocket request #9]
{
  "command": "disconnect"
}
littleK0i commented 2 years ago

Ok, values are actually returned as floats from Exasol Websocket API.

The only reason why it works for pandas is that EXPORT command internally formats all values as strings, so NLS_NUMERIC_CHARACTERS is applied. It does not happen with normal fetching.

You may define custom fetch_mapper function to transform values manually.

Example of mapper function: https://github.com/exasol/pyexasol/blob/master/pyexasol/mapper.py#L62-L97

Example which sets mapper function as connection option: https://github.com/exasol/pyexasol/blob/master/examples/a04_fetch_mapper.py#L24

redcatbear commented 1 year ago

I am closing this ticket, since there was no response from the person asking in over a month since the last answer.