chdb-io / chdb

chDB is an in-process OLAP SQL Engine 🚀 powered by ClickHouse
https://clickhouse.com/docs/en/chdb
Apache License 2.0
2.03k stars 72 forks

Reimplement the session mode #197

Open auxten opened 7 months ago

auxten commented 7 months ago

Current chDB (up to v2.0.2) relies on temporary disk storage to keep session data. Every time session.query runs, almost everything in memory is recreated and reinitialized, which causes many state problems, such as:

Some features implemented earlier, and bugs that were worked around, also need a better fix:

Some chDB contributors have also tried to improve the session:

Originally posted by **l1t1** February 7, 2024

```sql
:) create table a engine=Memory as select 1 a;
0.11995077133178711
:) select * from a;
Code: 60. DB::Exception: Table _local.a does not exist. (UNKNOWN_TABLE)
0.11473512649536133
```

Here is how clickhouse-local interactive mode works:

```
root@0a8b55995b6e:/auxten/chdb/tests# ./ch24.5/usr/bin/clickhouse
ClickHouse local version 24.5.1.1763 (official build).

0a8b55995b6e :) create table a engine=Memory as select 1 a;

CREATE TABLE a ENGINE = Memory AS SELECT 1 AS a

Query id: 967a5d72-bb39-4a42-8a11-a108eda2a5d9

Ok.

0 rows in set. Elapsed: 0.008 sec.

0a8b55995b6e :) select * from a;

SELECT * FROM a

Query id: e5be6b9b-b752-4418-8753-adb8cc69a127

   ┌─a─┐
1. │ 1 │
   └───┘

1 row in set. Elapsed: 0.008 sec.
```
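
The report above boils down to a two-statement check: create a Memory table in one query, then read it in the next. A minimal sketch of that check in Python follows. The helper name `memory_table_survives` is hypothetical, and it assumes the session reports a failed query by raising an exception, which may not hold for every chDB version.

```python
from typing import Callable


def memory_table_survives(run_query: Callable[[str], object]) -> bool:
    """Return True if a Memory table created by one query is visible to the next.

    `run_query` is any callable that executes a single SQL statement and
    raises on failure (an assumption about how errors are reported).
    """
    run_query("CREATE TABLE a ENGINE = Memory AS SELECT 1 AS a")
    try:
        run_query("SELECT * FROM a")
    except Exception:
        # For example "Code: 60 ... (UNKNOWN_TABLE)" as in the report above.
        return False
    return True


# Possible usage against chDB (not executed here):
#   from chdb import session as chs
#   sess = chs.Session()
#   print(memory_table_survives(sess.query))
```

Passing the query function in as a parameter keeps the check independent of any particular binding, so the same probe could be pointed at a different session implementation.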

The benefits are:

  1. Better support for states like 'Memory Table Engine', 'UDF', 'SET', 'USE'
  2. Less tricky code to handle default database and 'SET', 'USE' statements
  3. Without loading tables and running initialization on every query call, performance should be much better than in the current implementation
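
To illustrate the difference these points describe, here is a toy model in plain Python. It is not chDB code: both classes and their `query` signature are hypothetical, and a dict stands in for the temp directory. It only sketches why rebuilding in-memory state on every call loses Memory tables and repeats the initialization work, while a long-lived context does neither.

```python
class ReinitPerQuerySession:
    """Toy model: only what was written to `disk` survives between queries."""

    def __init__(self):
        self.disk = {}        # stands in for the temporary session directory
        self.init_count = 0   # how many times in-memory state was rebuilt

    def query(self, op, name, value=None, persistent=False):
        memory = dict(self.disk)  # rebuild in-memory state from disk
        self.init_count += 1
        if op == "create":
            memory[name] = value
            if persistent:
                self.disk[name] = value
            return None
        return memory.get(name)   # None models "table does not exist"


class LongLivedSession:
    """Toy model: one in-memory context is kept for the whole session."""

    def __init__(self):
        self.memory = {}
        self.init_count = 1       # initialized once, at construction

    def query(self, op, name, value=None, persistent=False):
        if op == "create":
            self.memory[name] = value
            return None
        return self.memory.get(name)


old, new = ReinitPerQuerySession(), LongLivedSession()
for sess in (old, new):
    sess.query("create", "a", 1)  # non-persistent, like ENGINE = Memory
print(old.query("select", "a"), old.init_count)
print(new.query("select", "a"), new.init_count)
```

In this model the per-query variant cannot find the table and has initialized twice, while the long-lived variant returns the stored value after a single initialization.
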
auxten commented 3 weeks ago

All issues with the label https://github.com/chdb-io/chdb/labels/Session are related to this feature:

#225

#258

#261

https://github.com/chdb-io/chdb-node/issues/18

auxten commented 3 weeks ago

Here is the rough plan:

  1. Upgrade the engine to 24.8 or newer, since ClickHouse has shipped many bug fixes and optimizations for clickhouse-local in the last 3 releases
  2. Handle BackgroundSchedulePool and the various kinds of Context better
  3. Reimplement the session mode