sfu-db / connector-x

Fastest library to load data from DB to DataFrames in Rust and Python
https://sfu-db.github.io/connector-x
MIT License

Python connectorx database query fails with "Packets out of sync" #663

Open alistarhu opened 4 months ago

alistarhu commented 4 months ago

What language are you using?

Python

What version are you using?

connectorx=0.3.1

What database are you using?

I am using a StarRocks 3.2.4 database.

What dataframe are you using?

PyMySQL=1.0.2
pyarrow=8.0.0
pandas=1.5.0
numpy=1.22.4

Can you describe your bug?

I can connect to the database and query data using PyMySQL.

However, when I run the same query with connectorx, it fails.

I also ran it with the environment variable RUST_BACKTRACE=full set and got the detailed error message below:

thread '<unnamed>' panicked at 'Couldn't convert Row { id: Bytes("23"), state: Bytes("Success") } to type alloc::string::String. (see FromRow documentation)', /github/home/.cargo/registry/src/github.com-1ecc6299db9ec823/mysql_common-0.27.5/src/row/convert/mod.rs:81:39
stack backtrace:
   0: rust_begin_unwind
             at /rustc/750bd1a7ff3e010611b97ee75d30b7cbf5f3a03c/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/750bd1a7ff3e010611b97ee75d30b7cbf5f3a03c/library/core/src/panicking.rs:142:14
   2: mysql_common::row::convert::FromRow::from_row
   3: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold
   4: mysql::conn::queryable::Queryable::query
   5: <r2d2_mysql::pool::MysqlConnectionManager as r2d2::ManageConnection>::has_broken
   6: <r2d2::PooledConnection<M> as core::ops::drop::Drop>::drop
   7: core::ptr::drop_in_place<r2d2::PooledConnection<r2d2_mysql::pool::MysqlConnectionManager>>
   8: <connectorx::sources::mysql::MySQLSource<P> as connectorx::sources::Source>::fetch_metadata
   9: connectorx::dispatcher::Dispatcher<S,D,TP>::run
  10: connectorx::get_arrow::get_arrow
  11: connectorx::arrow::write_arrow
  12: connectorx::read_sql
  13: std::panicking::try
  14: connectorx::__pyo3_raw_read_sql
  15: cfunction_vectorcall_FASTCALL_KEYWORDS
             at /usr/local/src/conda/python-3.8.16/Objects/methodobject.c:441:24
  16: _PyObject_Vectorcall
             at /usr/local/src/conda/python-3.8.16/Include/cpython/abstract.h:127:11
  17: call_function
             at /usr/local/src/conda/python-3.8.16/Python/ceval.c:4963:13
  18: _PyEval_EvalFrameDefault
             at /usr/local/src/conda/python-3.8.16/Python/ceval.c:3515:19
  19: PyEval_EvalFrameEx
             at /usr/local/src/conda/python-3.8.16/Python/ceval.c:741:12
  20: _PyEval_EvalCodeWithName
             at /usr/local/src/conda/python-3.8.16/Python/ceval.c:4298:14
  21: _PyFunction_Vectorcall
             at /usr/local/src/conda/python-3.8.16/Objects/call.c:436:12
  22: _PyObject_Vectorcall
             at /usr/local/src/conda/python-3.8.16/Include/cpython/abstract.h:127:11
  23: call_function
             at /usr/local/src/conda/python-3.8.16/Python/ceval.c:4963:13
  24: _PyEval_EvalFrameDefault
             at /usr/local/src/conda/python-3.8.16/Python/ceval.c:3515:19
  25: PyEval_EvalFrameEx
             at /usr/local/src/conda/python-3.8.16/Python/ceval.c:741:12
  26: _PyEval_EvalCodeWithName
             at /usr/local/src/conda/python-3.8.16/Python/ceval.c:4298:14
  27: PyEval_EvalCodeEx
             at /usr/local/src/conda/python-3.8.16/Python/ceval.c:4327:12
  28: PyEval_EvalCode
             at /usr/local/src/conda/python-3.8.16/Python/ceval.c:718:12
  29: run_eval_code_obj
             at /usr/local/src/conda/python-3.8.16/Python/pythonrun.c:1166:9
  30: run_mod
             at /usr/local/src/conda/python-3.8.16/Python/pythonrun.c:1188:9
  31: pyrun_file
             at /usr/local/src/conda/python-3.8.16/Python/pythonrun.c:1085:15
  32: pyrun_simple_file
             at /usr/local/src/conda/python-3.8.16/Python/pythonrun.c:439:13
  33: PyRun_SimpleFileExFlags
             at /usr/local/src/conda/python-3.8.16/Python/pythonrun.c:472:15
  34: pymain_run_file
             at /usr/local/src/conda/python-3.8.16/Modules/main.c:391:15
  35: pymain_run_python
             at /usr/local/src/conda/python-3.8.16/Modules/main.c:616:21
  36: Py_RunMain
             at /usr/local/src/conda/python-3.8.16/Modules/main.c:695:5
  37: Py_BytesMain
             at /usr/local/src/conda/python-3.8.16/Modules/main.c:1127:12
  38: __libc_start_main
  39: <unknown>
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Traceback (most recent call last):
  File "connectorx_debug.py", line 38, in <module>
    df = cx.read_sql(aiyield_engine_url, sql2, protocol='text', return_type='arrow')
  File "/home/amedac/miniconda3/envs/etl_env/lib/python3.8/site-packages/connectorx/__init__.py", line 257, in read_sql
    result = _read_sql(
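
Most of the frames in the backtrace above belong to the CPython interpreter. As a reading aid, here is a minimal sketch that keeps only the Rust-looking frames. The helper name `rust_frames` and the "symbol contains `::`" heuristic are assumptions made for illustration and are not part of connectorx. The sample text is copied from the trace above.

```python
import re

# Matches backtrace frame lines such as "   5: <r2d2_mysql::...>::has_broken".
# The indented "at /path/file.c:12:3" location lines do not match.
FRAME_RE = re.compile(r"^\s*(\d+):\s+(\S.*)$")


def rust_frames(backtrace):
    """Return (index, symbol) pairs for frames whose symbol looks like a Rust path.

    Heuristic (an assumption): Rust symbols contain '::', CPython symbols do not.
    """
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match and "::" in match.group(2):
            frames.append((int(match.group(1)), match.group(2).strip()))
    return frames


# A few lines copied from the backtrace in this report.
SAMPLE = """\
   2: mysql_common::row::convert::FromRow::from_row
   4: mysql::conn::queryable::Queryable::query
   5: <r2d2_mysql::pool::MysqlConnectionManager as r2d2::ManageConnection>::has_broken
   6: <r2d2::PooledConnection<M> as core::ops::drop::Drop>::drop
   8: <connectorx::sources::mysql::MySQLSource<P> as connectorx::sources::Source>::fetch_metadata
  16: _PyObject_Vectorcall
             at /usr/local/src/conda/python-3.8.16/Include/cpython/abstract.h:127:11
"""

for index, symbol in rust_frames(SAMPLE):
    print(index, symbol)
```

Read from the innermost frame outwards, the filtered frames show the panic being raised in `FromRow::from_row` during a query issued by `has_broken`, which runs when the pooled connection is dropped inside `fetch_metadata`.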

It seems that some basic system library is incompatible. Could you give me some hints?

What are the steps to reproduce the behavior?

Sample code:

import connectorx as cx

# Two test queries: all columns, and the string column only
sql1 = 'select * from table_name'
sql2 = 'select state from table_name'

# user_name, passwd, host, port and db_name stand for the real connection settings
url = f'mysql://{user_name}:{passwd}@{host}:{port}/{db_name}'
df = cx.read_sql(url, sql1, protocol='text', return_type='arrow')

Database setup if the error only happens on specific data or data type

field_name  type
id          INT
state       string
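
The sample code builds the connection URL with a plain f-string. Characters such as `@`, `:` or `/` in the user name or password may then be misread as URL delimiters. The sketch below percent-encodes the credentials first. The helper name `build_mysql_url` and every value passed to it are placeholders invented for illustration, not the settings used in this report.

```python
from urllib.parse import quote


def build_mysql_url(user_name, passwd, host, port, db_name):
    """Build a mysql:// URL with percent-encoded credentials."""
    # safe="" also encodes '/', which quote() leaves untouched by default.
    user = quote(user_name, safe="")
    password = quote(passwd, safe="")
    return f"mysql://{user}:{password}@{host}:{port}/{db_name}"


# Placeholder values for illustration only.
url = build_mysql_url("etl_user", "p@ss:word/1", "127.0.0.1", 9030, "demo_db")
print(url)
```

The resulting string can be passed to `cx.read_sql` in place of the f-string URL from the sample code.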