sfu-db / connector-x

Fastest library to load data from DB to DataFrames in Rust and Python
https://sfu-db.github.io/connector-x
MIT License

Connectorx not able to read 'infinite' value timestamps from postgres #572

Closed AndrewJackson2020 closed 4 months ago

AndrewJackson2020 commented 4 months ago

What language are you using?

Python

What version are you using?

0.3.2_alpha.2

What database are you using?

PostgreSQL

What dataframe are you using?

Pandas and polars

Can you describe your bug?

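ConnectorX cannot read 'infinity' timestamp values from PostgreSQL. Calling cx.read_sql on a table that contains such a value fails with a RuntimeError (see the error below).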

What are the steps to reproduce the behavior?

Run the Python script below, which reproduces the behavior.

Database setup if the error only happens on specific data or data type

See code below

Example query / code
import connectorx as cx
import pandas as pd
import sqlalchemy
from sqlalchemy.orm import Session

connection_string = "postgresql://postgres:@localhost:9876"
engine = sqlalchemy.create_engine("postgresql+psycopg2://postgres:@localhost:9876/")
session = Session(engine)

def setup() -> None:

  create_table_sql = """
  CREATE TABLE test_tbl (
    test_col timestamp with time zone
  );
  """
  session.execute(sqlalchemy.text(create_table_sql))  # text() is required for raw SQL strings in SQLAlchemy 2.0

  insert_row_sql = """
  INSERT INTO test_tbl (test_col)
  VALUES ('infinity'::timestamp with time zone);
  """
  session.execute(sqlalchemy.text(insert_row_sql))  # text() is required for raw SQL strings in SQLAlchemy 2.0
  session.commit()

def teardown() -> None:
  sql = 'DROP TABLE IF EXISTS test_tbl;'
  session.execute(sqlalchemy.text(sql))  # text() is required for raw SQL strings in SQLAlchemy 2.0
  session.commit()

def test_pandas() -> None:
  setup()
  df = pd.read_sql(sql='SELECT * FROM test_tbl;', con=engine)
  print(df)
  teardown()

def test_cx() -> None:
  setup()
  df = cx.read_sql(conn=connection_string, query='SELECT * FROM test_tbl', return_type='pandas')
  print(df)
  teardown()

if __name__ == '__main__':
  teardown()
  test_pandas()
  test_cx()

What is the error?

Traceback (most recent call last):
  File "./test.py", line 49, in <module>
    test_cx()
  File "./test.py", line 42, in test_cx
    df = cx.read_sql(conn=connection_string, query='SELECT * FROM test_tbl', return_type='pandas')
  File "/home/ajackson/repos/private/omcmono/.venv/lib/python3.8/site-packages/connectorx/__init__.py", line 264, in read_sql
    result = _read_sql(
RuntimeError: error deserializing column 0: value too large to decode
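
Possible workaround (a sketch, not a confirmed fix): the message suggests that the 'infinity' value falls outside the range the decoder can represent, so one option is to keep infinite values out of the result on the database side. The helper below builds a query that uses PostgreSQL's isfinite() function to return NULL in place of 'infinity' and '-infinity'. The helper names (finite_or_null, build_query, read_finite) are hypothetical and are not part of connectorx, and the approach has not been tested against the version in this report.

```python
def finite_or_null(column: str) -> str:
    """Return a SQL expression that yields NULL for infinite timestamps.

    isfinite() is a built-in PostgreSQL function. The column name is
    interpolated as-is, so pass only trusted identifiers.
    """
    return f"CASE WHEN isfinite({column}) THEN {column} END AS {column}"


def build_query(table: str, column: str) -> str:
    """Build a SELECT that never returns 'infinity' or '-infinity'."""
    return f"SELECT {finite_or_null(column)} FROM {table}"


def read_finite(conn: str, table: str, column: str):
    """Hypothetical wrapper: run the filtered query through connectorx."""
    # Imported lazily so the query helpers work without connectorx installed.
    import connectorx as cx

    return cx.read_sql(conn=conn, query=build_query(table, column), return_type="pandas")


print(build_query("test_tbl", "test_col"))
# SELECT CASE WHEN isfinite(test_col) THEN test_col END AS test_col FROM test_tbl
```

With the script above, read_finite(connection_string, 'test_tbl', 'test_col') would replace the cx.read_sql call in test_cx. The infinite values are lost in the process; if they need to be preserved, selecting test_col::text instead returns them as the strings 'infinity' and '-infinity'.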