chdb-io / chdb

chDB is an in-process OLAP SQL Engine 🚀 powered by ClickHouse
https://doc.chdb.io
Apache License 2.0
1.85k stars 71 forks source link

how to query postgre table? #225

Open wonb168 opened 4 weeks ago

wonb168 commented 4 weeks ago

data may stock in rdb,like postgre,and how to query it by chdb?

auxten commented 4 weeks ago

You can check docs here:

https://clickhouse.com/docs/en/engines/database-engines/postgresql

and

https://clickhouse.com/docs/en/engines/table-engines/integrations/postgresql

wonb168 commented 4 weeks ago

I tested, but:

>>> import chdb
>>> sql="""CREATE DATABASE chgp
... ENGINE = PostgreSQL('127.0.0.1:2345', 'mdmaster_peacebird_uat2', 'gpadmin', 'mysecretpassword', 'tenant_peacebird_adm',1);
... """
>>> chdb.query(sql)

>>> 
>>> chdb.query("SELECT count(*) FROM chgp.gto_skc_store_step_kpi_summary_duck;")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/python3.8/lib/python3.8/site-packages/chdb/__init__.py", line 78, in query
    raise ChdbError(res.error_message())
chdb.ChdbError: Code: 81. DB::Exception: Database chgp does not exist. (UNKNOWN_DATABASE)
>>> 

I want to use chdb to speed up postgre, calculate in chdb and then write the result back to postgre. How to do?thanks!

auxten commented 4 weeks ago

Try use session


from chdb import session as chs

## Create DB, Table, View in temp session, auto cleanup when session is deleted.
sess = chs.Session()
sess.query("CREATE DATABASE IF NOT EXISTS db_xxx ENGINE = Atomic")
sess.query("CREATE TABLE IF NOT EXISTS db_xxx.log_table_xxx (x String, y Int) ENGINE = Log;")
sess.query("INSERT INTO db_xxx.log_table_xxx VALUES ('a', 1), ('b', 3), ('c', 2), ('d', 5);")
sess.query(
    "CREATE VIEW db_xxx.view_xxx AS SELECT * FROM db_xxx.log_table_xxx LIMIT 4;"
)
print("Select from view:\n")
print(sess.query("SELECT * FROM db_xxx.view_xxx", "Pretty"))
wonb168 commented 4 weeks ago

Thank you, this way can query greenplum table now, but how to query temp table ? If use duckdb, like this:

import duckdb as dd
dd.execute("call postgres_execute('gp','create temp table tmp as select 1 id')")
dd.execute("select * from postgres_query('gp','select * from tmp')")

how to do in chdb?

and how to write chdb result back to gp?

thanks!

auxten commented 4 weeks ago
  1. You can create a view, It's more ordinary SQL
  2. There is an insert example: https://clickhouse.com/docs/en/engines/database-engines/postgresql#examples-of-use

There are also some full examples: https://clickhouse.com/blog/migrating-data-between-clickhouse-postgres https://clickhouse.com/blog/migrating-data-between-clickhouse-postgres-part-2

wonb168 commented 4 weeks ago

then I query gp table is OK,

sess.query("select * from public.base_update_org limit 1")

but how to create table in chdb?

sess.query("create table tmp Engine=Memory as select * from public.base_update_org limit 1")
sess.query("select * from tmp")

chdb.ChdbError: Code: 60. DB::Exception: Table _local.tmp does not exist. (UNKNOWN_TABLE)

auxten commented 4 weeks ago

The default _local db is not persist, you should create a new one to save your modifications. See:

https://github.com/chdb-io/chdb/blob/50fd104a12b422c4ef1cb45f8026721b0ea9f6d5/tests/test_usedb.py#L26-L38