Closed danthegoodman1 closed 1 month ago
In [19]: import chdb.dataframe as cdf
In [20]: tbl = cdf.Table(arrow_table=arw)
In [21]: ret_tbl = tbl.query('select * from __table__')
In [22]: print(ret_tbl)
count()
0 3231245
Seems it's that easy; the chdb.dataframe naming is a bit confusing.
Bonus points for showing how to make it a virtual table with another name:
ret_tbl = cdf.query(sql='select * from __tb1__', tb1=cdf.Table(arrow_table=arw))
I will try to move chdb.dataframe directly into the chdb package. That way we could also query a DataFrame from within a chDB Session.
New chDB 2.0 API for querying an Arrow Table: https://github.com/chdb-io/chdb/blob/d990ff80a54f5bf60ecea94d8cff8ec1f12c1d94/tests/test_query_py.py#L94-L137
The example for the section "Query On Table (Pandas DataFrame, Parquet file/bytes, Arrow bytes)" in the README has a DataFrame example, but not a pyarrow one. Also, a question: is it a zero-copy select?