tconbeer / harlequin

The SQL IDE for Your Terminal.
https://harlequin.sh
MIT License
3.35k stars 75 forks source link

Support ClickHouse and other databases #230

Open alexey-milovidov opened 10 months ago

alexey-milovidov commented 10 months ago

I want to use this IDE with ClickHouse (which supports similar scenarios but is more powerful than DuckDB).

tconbeer commented 10 months ago

I like where you're going, but I'm not planning to support this at the moment. We take advantage of duckDB's in-process architecture and python APIs quite a bit, so it's not totally straightforward to plug in a different database (which would require connecting over http, etc.).

You're welcome to fork the project, or if you're willing to wait, probably some day around Harlequin v2 or v3 I'll fork this and provide a general-purpose SQL client that can connect to clickhouse, postgres, etc.

alexey-milovidov commented 10 months ago

There is an in-process Python module of ClickHouse, named chdb: https://github.com/chdb-io/chdb

tconbeer commented 10 months ago

ok, interesting. That's closer, although still missing most of the APIs we use. I also don't understand the persistence model, if there is one? i.e., I don't think you can connect to an existing clickhouse db with chdb? seems designed to mostly operate on external files

alexey-milovidov commented 10 months ago

Both external and persistent tables are available. E.g., you can create a MergeTree table that will be stored locally. Or you can communicate with external ClickHouse using the remote table function. Or read/write local and remote files.

auxten commented 10 months ago

I'm willing to assist in integrating harlequin with chdb.

tconbeer commented 10 months ago

That's awesome, thanks so much. Let me noodle on this a little. First step is probably me enumerating all of the duckdb integrations so we can think through alternatives.

Mause commented 10 months ago

Worth mentioning that @alexey-milovidov is the Clickhouse CTO & designer, so he can probably fix any issues that you have with integrating Clickhouse as well (missing features, bugs, etc)

lmangani commented 10 months ago

ok, interesting. That's closer, although still missing most of the APIs we use. I also don't understand the persistence model, if there is one? i.e., I don't think you can connect to an existing clickhouse db with chdb? seems designed to mostly operate on external files

Hello @tconbeer chdb can connect to clickhouse-servers as well as directly access local or remote files with compatible formats. It supports session persistence using temporary or persistent local storage. The APIs are indeed are still quite basic, but this could be a great opportunity for us to find ways to extend and improve chdb feature coverage, so any input is greatly welcome!

tconbeer commented 9 months ago

Spent some time on this.

This wasn't in my original plan, but there isn't any reason Harlequin can't be the client to any database. The in-process part isn't very important.

I'll create an interface (an ABC or two) for a Harlequin database adapter, and refactor the DuckDB parts to be an implementation of that interface. The interface will include some optional features that may not be relevant or may not be possible for other databases, like loading extensions, exporting query results (copy to), providing autocompletions, etc. We'll distribute these adapters as separate plug-ins.

At start-up, Harlequin will discover which plugins are loaded, and then which features are available and adapt accordingly. Users will select their plugin and provide connection parameters at the command line. It would be nice to specify the connection in a config file (#206) instead (using profiles or similar).

I do want to be clear this feels like v2 to me, and that'll happen on my own timeline, with many of the smaller features shipping first. If y'all want to accelerate that, you can either start contributing to the project or you can pay me (I work as a contractor for my day job already). If you're interested in the latter, please email me (ted at shandy dot eye-oh) and we can schedule a call.

Finally, creating a delightful interface for DuckDB is still my first priority, so if I feel like this refactor puts that at risk for whatever reason, I reserve the right to change tacks and abandon this plan.

Thanks for your excitement around Harlequin -- I'm glad that it's pushed me to think about this as something bigger.

tconbeer commented 8 months ago

The pieces are there now, although undocumented. I'm going to implement an SQLite adapter to pressure-test the API and build out a guide for creating adapters; that should happen over the next week or two. If you want to get cracking sooner than that, you can work off the following examples:

[tool.poetry.plugins."harlequin.adapter"]
duckdb = "harlequin_duckdb:DuckDbAdapter"
sqlite = "harlequin_sqlite:HarlequinSqliteAdapter"

There isn't a great, generalizable test harness (yet). I'll likely be making big changes/improvements to the tests of the project to make it easier to test other adapters as I develop the SQLite adapter. You can reference the tests in tests/adapter_tests/test_duckdb.py; you could also swap out the fixtures in tests/conftest.py to use different adapter fixtures.

tconbeer commented 7 months ago

@auxten @alexey-milovidov @lmangani this is ready for you, now.

The guide for contributing an adapter can be found here: https://harlequin.sh/docs/contributing/adapter-guide A template repo is here: https://github.com/tconbeer/harlequin-adapter-template My postgres adapter can be found as a reference here: https://github.com/tconbeer/harlequin-postgres

The template repo includes a minimal set of tests.

The guide should include everything else you need to know, but feel free to ask any questions, or we can set up a quick call.

lmangani commented 7 months ago

Thanks @tconbeer this is amazing! I'll try this as soon as I get a time pocket.

dchimeno commented 7 months ago

is someone working on this? I could do it, but I don't want duplicate efforts :)

lmangani commented 7 months ago

@dchimeno I don't think anyone started - at least not from the chdb side - so feel free!

ThomAub commented 2 months ago

I started to get a basic ClickHouse adapter working you can find it here harlequin-clickhouse

@tconbeer It was great to look at the harlequin-postgres and use the adapter-template !

This is currently using clickhouse-driver without connection pool, it doesn't really provide completions and there is a lot to improve.

@auxten i wanted to go chdb all the way and use the remote option but this would require parsing the sql query to replace the FROM table_name with FROM remote(host, ..., table_name). I think i will create another adapter for chdb/clickhouse-local

auxten commented 2 months ago

Great to hear this! @ThomAub If any question encountered, feel free to ping me or raise an issue for chdb.