exasol / pyexasol

Exasol Python driver with low overhead, fast HTTP transport and compression
MIT License

Mixed type error message #100

Closed janzinho closed 1 year ago

janzinho commented 2 years ago

Hi, I'm getting the following warning when trying to load a table from Exasol into a pandas DataFrame using pyexasol:

"/opt/conda/lib/python3.7/site-packages/pyexasol/connection.py:314: DtypeWarning: Columns (14,15) have mixed types.Specify dtype option on import or set low_memory=False.
  result = callback(http_thread.read_pipe, dst, **callback_params)"

The original columns (14,15) have the TIMESTAMP type in Exasol and are always NULL.

My script:

import getpass

import pyexasol


def query_from_exasol(query):
    # Open a connection; credentials are prompted for interactively.
    exasol_connection = pyexasol.connect(
        dsn='xxx',
        user=input("Enter Exasol username: "),
        password=getpass.getpass("Enter Exasol password: "),
    )
    # Export the query result into a pandas DataFrame.
    return exasol_connection.export_to_pandas(query)


QUERY = """
select *
from schema.table;
"""

df = query_from_exasol(QUERY)

Best, Jan

littleK0i commented 2 years ago

If you see this warning, I suspect there are mixed types or some misalignment in data.

You may debug it by using .export_to_file() and trying to read this file later with pandas.read_csv().
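As a rough illustration of that debugging step, the sketch below inspects an already exported CSV file using only the standard library, before the file is handed to pandas.read_csv(). It counts how many fields each row has (to spot misalignment) and which kinds of raw values each column holds (to spot mixed types). The names classify and summarize_csv are hypothetical helpers, and the three buckets "empty", "numeric" and "text" are a simplification, not what pandas does internally.

```python
import csv
from collections import Counter


def classify(value):
    """Roughly bucket one raw CSV field: empty, numeric, or other text."""
    if value == "":
        return "empty"
    try:
        float(value)
        return "numeric"
    except ValueError:
        return "text"


def summarize_csv(path, has_header=True):
    """Return (field counts per row, value kinds per column) for a CSV file."""
    row_lengths = Counter()
    kinds = {}
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None) if has_header else None
        for row in reader:
            row_lengths[len(row)] += 1
            for i, value in enumerate(row):
                # Fall back to the column index when there is no header name.
                name = header[i] if header and i < len(header) else i
                kinds.setdefault(name, Counter())[classify(value)] += 1
    return row_lengths, kinds


# Hypothetical usage, assuming `C` is an open pyexasol connection and the
# file was written with a header row, for example:
#   C.export_to_file("dump.csv", QUERY, export_params={"with_column_names": True})
# row_lengths, kinds = summarize_csv("dump.csv")
```

If more than one row length shows up, the data is probably misaligned. If a column reports several kinds of values, reading the file with pandas.read_csv(path, dtype=str) and looking at the distinct values in that column is a reasonable next step.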

Specifically with timestamps, I remember a few odd cases where timestamp values fall outside the bounds supported by pandas. For example, an Exasol timestamp can go as low as 0001-01-01 00:00:00, while pandas timestamps at the default nanosecond resolution only go back to about 1677-09-21, which may cause such values to be treated as strings instead of timestamps.
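To make the bounds concrete: at nanosecond resolution, pandas stores a timestamp as a signed 64-bit count of nanoseconds since 1970-01-01, which limits the representable range. The sketch below derives approximate bounds with the standard library only, rounded inward to whole microseconds because datetime has no nanosecond field. The helper fits_ns_range is a hypothetical name, and newer pandas versions can also use coarser resolutions, so treat this as an illustration of the nanosecond case only.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

# An int64 holds values from -(2**63) to 2**63 - 1. Interpreted as
# nanoseconds since the epoch and truncated toward zero to microseconds:
lower = EPOCH - timedelta(microseconds=2**63 // 1000)
upper = EPOCH + timedelta(microseconds=(2**63 - 1) // 1000)


def fits_ns_range(ts):
    """True if a naive datetime fits into int64 nanoseconds since the epoch."""
    return lower <= ts <= upper


print(lower)  # earliest representable moment, in the year 1677
print(upper)  # latest representable moment, in the year 2262
print(fits_ns_range(datetime(1, 1, 1)))     # Exasol's lowest timestamp
print(fits_ns_range(datetime(2020, 1, 1)))  # an ordinary modern timestamp
```

A value such as 0001-01-01 00:00:00 falls far outside this range, which is consistent with it ending up as a plain string rather than a timestamp.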

It is hard to tell without seeing actual data.

redcatbear commented 1 year ago

@janzinho were you able to debug and maybe fix your problem with the suggestion @littleK0i provided?

redcatbear commented 1 year ago

Closing ticket, since there is no response from requester.