chdb-io / chdb

chDB is an in-process OLAP SQL Engine 🚀 powered by ClickHouse
https://clickhouse.com/docs/en/chdb
Apache License 2.0

s3 table function and engine breaks plpython3u (postgresql procedure language extension) #139

Closed Iced-Sun closed 10 months ago

Iced-Sun commented 10 months ago

Describe what's wrong


do language plpython3u $$
import chdb
chdb.query(sql='''select * from s3('https://bucket.s3.cn-northwest-1.amazonaws.com.cn/file.parquet') limit 10''')
$$;

do language plpython3u $$
import chdb
chdb.query(sql='''
CREATE TABLE s3_engine_table (name String, value UInt32)
    ENGINE=S3('https://clickhouse-public-datasets.s3.amazonaws.com/my-test-bucket-768/test-data.csv.gz', 'CSV', 'gzip')
    SETTINGS input_format_with_names_use_header = 0;
INSERT INTO s3_engine_table VALUES ('one', 1), ('two', 2), ('three', 3);
SELECT * FROM s3_engine_table LIMIT 2;
''')
$$;

The above queries crash the PostgreSQL server. (server process (PID 1271677) was terminated by signal 6: Aborted; DETAIL: Failed process was running: do language plpython3u $$......)

The file table function works just fine.

Does it reproduce on recent release?

It reproduces on v0.15.0.

How to reproduce

Any S3-related query crashes the plpython3u environment.

Expected behavior


S3-related queries should work with plpython3u.

lmangani commented 10 months ago

@Iced-Sun did you try the same but using a chdb persistent session? Ephemeral queries are only suitable for one-shot use.

from chdb import session as chs
sess = chs.Session()
sess.query("...")

Also make sure to either include the USE statement or explicitly specify both db.table.
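
For illustration, the two options could look roughly like the sketch below. Everything named here (`demo_db`, `events`, `with_use`, `qualified`, `FakeSession`) is hypothetical. `FakeSession` only records the SQL so the sketch runs without chdb installed; with chdb available, `chs.Session()` would take its place. Keeping `USE` in the same query string as the statement is an assumption, since whether a `USE` carries over between calls may depend on the chdb version.

```python
# Hypothetical names throughout: demo_db, events, with_use, qualified, FakeSession.

def with_use(db: str, sql: str) -> str:
    """Option 1: prefix the statement with USE so the bare table name resolves."""
    return f"USE {db}; {sql}"


def qualified(db: str, table: str) -> str:
    """Option 2: spell out db.table so no USE is needed."""
    return f"{db}.{table}"


class FakeSession:
    """Stand-in that only records SQL; replace with chdb.session.Session() to run it."""

    def __init__(self):
        self.log = []

    def query(self, sql: str) -> str:
        self.log.append(sql)
        return sql


sess = FakeSession()
sess.query(with_use("demo_db", "SELECT * FROM events LIMIT 2"))
sess.query(f"SELECT * FROM {qualified('demo_db', 'events')} LIMIT 2")
for sql in sess.log:
    print(sql)
```

Either form avoids relying on an implicit current database, which is the point of the advice above.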

As for why Postgres crashes, that's unrelated. If you think this is chdb related, please confirm it works with clickhouse-local.
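
A related check, not the clickhouse-local test itself, is to run the same chdb call in a plain Python process outside PostgreSQL and see how that process exits. The sketch below is one way to do that, assuming a POSIX system, where a negative return code means the child was killed by a signal (signal 6 is SIGABRT). The helper name `run_isolated` is hypothetical, and the commented-out chdb line uses a placeholder query.

```python
import signal
import subprocess
import sys


def run_isolated(code: str) -> str:
    """Run `code` in a fresh Python interpreter and describe how it exited."""
    proc = subprocess.run([sys.executable, "-c", code], capture_output=True, text=True)
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        return f"killed by {signal.Signals(-proc.returncode).name}"
    return f"exited with code {proc.returncode}"


# Demonstration with a child that aborts on purpose (no chdb needed).
print(run_isolated("import os; os.abort()"))

# Hypothetical real check: swap the placeholder query for the failing one.
# print(run_isolated("import chdb; print(chdb.query(sql='select 1'))"))
```

If the query exits cleanly here but still brings down the server under plpython3u, that points toward the embedding environment rather than chdb.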

Iced-Sun commented 10 months ago

Emm, it should be a plpython thing. Thanks anyway.