tconbeer / harlequin

The SQL IDE for Your Terminal.
https://harlequin.sh
MIT License
3.74k stars 86 forks source link

Support ClickHouse and other databases #230

Open alexey-milovidov opened 1 year ago

alexey-milovidov commented 1 year ago

I want to use this IDE with ClickHouse (which supports similar scenarios but is more powerful than DuckDB).

tconbeer commented 1 year ago

I like where you're going, but I'm not planning to support this at the moment. We take advantage of duckDB's in-process architecture and python APIs quite a bit, so it's not totally straightforward to plug in a different database (which would require connecting over http, etc.).

You're welcome to fork the project, or if you're willing to wait, probably some day around Harlequin v2 or v3 I'll fork this and provide a general-purpose SQL client that can connect to clickhouse, postgres, etc.

alexey-milovidov commented 1 year ago

There is an in-process Python module of ClickHouse, named chdb: https://github.com/chdb-io/chdb

tconbeer commented 1 year ago

ok, interesting. That's closer, although still missing most of the APIs we use. I also don't understand the persistence model, if there is one? i.e., I don't think you can connect to an existing clickhouse db with chdb? seems designed to mostly operate on external files

alexey-milovidov commented 1 year ago

Both external and persistent tables are available. E.g., you can create a MergeTree table that will be stored locally. Or you can communicate with external ClickHouse using the remote table function. Or read/write local and remote files.

auxten commented 1 year ago

I'm willing to assist in integrating harlequin with chdb.

tconbeer commented 1 year ago

That's awesome, thanks so much. Let me noodle on this a little. First step is probably me enumerating all of the duckdb integrations so we can think through alternatives.

Mause commented 1 year ago

Worth mentioning that @alexey-milovidov is the Clickhouse CTO & designer, so he can probably fix any issues that you have with integrating Clickhouse as well (missing features, bugs, etc)

lmangani commented 1 year ago

ok, interesting. That's closer, although still missing most of the APIs we use. I also don't understand the persistence model, if there is one? i.e., I don't think you can connect to an existing clickhouse db with chdb? seems designed to mostly operate on external files

Hello @tconbeer chdb can connect to clickhouse-servers as well as directly access local or remote files with compatible formats. It supports session persistence using temporary or persistent local storage. The APIs are indeed are still quite basic, but this could be a great opportunity for us to find ways to extend and improve chdb feature coverage, so any input is greatly welcome!

tconbeer commented 1 year ago

Spent some time on this.

This wasn't in my original plan, but there isn't any reason Harlequin can't be the client to any database. The in-process part isn't very important.

I'll create an interface (an ABC or two) for a Harlequin database adapter, and refactor the DuckDB parts to be an implementation of that interface. The interface will include some optional features that may not be relevant or may not be possible for other databases, like loading extensions, exporting query results (copy to), providing autocompletions, etc. We'll distribute these adapters as separate plug-ins.

At start-up, Harlequin will discover which plugins are loaded, and then which features are available and adapt accordingly. Users will select their plugin and provide connection parameters at the command line. It would be nice to specify the connection in a config file (#206) instead (using profiles or similar).

I do want to be clear this feels like v2 to me, and that'll happen on my own timeline, with many of the smaller features shipping first. If y'all want to accelerate that, you can either start contributing to the project or you can pay me (I work as a contractor for my day job already). If you're interested in the latter, please email me (ted at shandy dot eye-oh) and we can schedule a call.

Finally, creating a delightful interface for DuckDB is still my first priority, so if I feel like this refactor puts that at risk for whatever reason, I reserve the right to change tacks and abandon this plan.

Thanks for your excitement around Harlequin -- I'm glad that it's pushed me to think about this as something bigger.

tconbeer commented 1 year ago

The pieces are there now, although undocumented. I'm going to implement an SQLite adapter to pressure-test the API and build out a guide for creating adapters; that should happen over the next week or two. If you want to get cracking sooner than that, you can work off the following examples:

[tool.poetry.plugins."harlequin.adapter"]
duckdb = "harlequin_duckdb:DuckDbAdapter"
sqlite = "harlequin_sqlite:HarlequinSqliteAdapter"

There isn't a great, generalizable test harness (yet). I'll likely be making big changes/improvements to the tests of the project to make it easier to test other adapters as I develop the SQLite adapter. You can reference the tests in tests/adapter_tests/test_duckdb.py; you could also swap out the fixtures in tests/conftest.py to use different adapter fixtures.

tconbeer commented 11 months ago

@auxten @alexey-milovidov @lmangani this is ready for you, now.

The guide for contributing an adapter can be found here: https://harlequin.sh/docs/contributing/adapter-guide A template repo is here: https://github.com/tconbeer/harlequin-adapter-template My postgres adapter can be found as a reference here: https://github.com/tconbeer/harlequin-postgres

The template repo includes a minimal set of tests.

The guide should include everything else you need to know, but feel free to ask any questions, or we can set up a quick call.

lmangani commented 11 months ago

Thanks @tconbeer this is amazing! I'll try this as soon as I get a time pocket.

dchimeno commented 11 months ago

is someone working on this? I could do it, but I don't want duplicate efforts :)

lmangani commented 11 months ago

@dchimeno I don't think anyone started - at least not from the chdb side - so feel free!

ThomAub commented 6 months ago

I started to get a basic ClickHouse adapter working you can find it here harlequin-clickhouse

@tconbeer It was great to look at the harlequin-postgres and use the adapter-template !

This is currently using clickhouse-driver without connection pool, it doesn't really provide completions and there is a lot to improve.

@auxten i wanted to go chdb all the way and use the remote option but this would require parsing the sql query to replace the FROM table_name with FROM remote(host, ..., table_name). I think i will create another adapter for chdb/clickhouse-local

auxten commented 6 months ago

Great to hear this! @ThomAub If any question encountered, feel free to ping me or raise an issue for chdb.

auxten commented 1 month ago

I started to get a basic ClickHouse adapter working you can find it here harlequin-clickhouse

@tconbeer It was great to look at the harlequin-postgres and use the adapter-template !

This is currently using clickhouse-driver without connection pool, it doesn't really provide completions and there is a lot to improve.

@auxten i wanted to go chdb all the way and use the remote option but this would require parsing the sql query to replace the FROM table_name with FROM remote(host, ..., table_name). I think i will create another adapter for chdb/clickhouse-local

Hey, @ThomAub anything we could help on chDB here?

ThomAub commented 1 month ago

hey, @auxten I think I will make distinct adapter because it's not easy to make a single adpater for both chDB and ClickHouse

auxten commented 1 month ago

Yeah, chDB and ClickHouse client are quite different. If there is anything we can do to help please let me know.

On Tue, 1 Oct 2024 at 5:19 AM, Thomas @.***> wrote:

hey, @auxten https://github.com/auxten I think I will make distinct adapter because it's not easy to make a single adpater for both chDB and ClickHouse

— Reply to this email directly, view it on GitHub https://github.com/tconbeer/harlequin/issues/230#issuecomment-2384162925, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAB2UE4RH5PRBC2PCWJR4D3ZZG56DAVCNFSM6AAAAAA463VJE6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGOBUGE3DEOJSGU . You are receiving this because you were mentioned.Message ID: @.***>

ThomAub commented 1 month ago

If there is anything we can do to help please let me know.

Do you think it should use the dbapi mode or the Session mode. I know the session api will soon be improve but will the public api be the same ? I don't know if the dbapi and cursor is the best way to create bot tmp and persistent session ?

auxten commented 1 month ago

I will keep the API stable during the improvement. chDB DBAPI can be persistent, examples are here and here Feel free to use DBAPI or Session; I'm here to help with any issues.