ClickHouse / metabase-clickhouse-driver

ClickHouse database driver for the Metabase business intelligence front-end
Apache License 2.0
476 stars 92 forks source link

`SET ROLE` statement problem on clustered instances #192

Closed mhkarimi1383 closed 5 months ago

mhkarimi1383 commented 1 year ago

Describe the bug

SET statement is not working with clustered instances

See https://github.com/ClickHouse/ClickHouse/discussions/40924

Steps to reproduce

  1. Install the latest version of driver
  2. Bring up a clickhouse instance (clustered)
  3. Add the instance and try to run query on that

Expected behaviour

Use HTTP Params or disable impersonation on clustered instances

Error log

Configuration

Environment

ClickHouse server

mhkarimi1383 commented 1 year ago

Any updates on this?

I was into fix that but I have a problem where I don't know what are the props of the first argument of (driver) in set-role-statement

I think adding a property in driver and user select that if ClickHouse instance is clustered or not, If yes this method can return empty string

https://github.com/ClickHouse/metabase-clickhouse-driver/blob/master/src/metabase/driver/clickhouse.clj#L203

slvrtrn commented 1 year ago

@mhkarimi1383, this is clearly an overlook from me. I think this feature (:connection-impersonation) should be turned off for now, as SET ROLE and other session-level commands are currently limited to a particular node even with session usage, which essentially renders them useless with pooled connections as they are executed as separate statements and there is no guarantee that the consecutive requests will use the same connection and inherit what was previously set there.

Stay tuned.

mhkarimi1383 commented 1 year ago

@slvrtrn Thanks, Also adding clustered ClickHouse to the current docker compose setup for testing will be great 😃

slvrtrn commented 1 year ago

@mhkarimi1383, I think it will introduce some complications as datasets are ENGINE Memory, and many of them are created in place, and we need to use wait_end_of_query and other parameters to ensure that the data is synchronized across the nodes, etc, etc...

But I agree that adding both an on-premise cluster of two nodes and ClickHouse Cloud to the mix is highly beneficial (that's how clickhouse-js CI is set up)

mhkarimi1383 commented 1 year ago

@slvrtrn If it's okay I will add clustered ClickHouse docker compose setup

slvrtrn commented 1 year ago

@mhkarimi1383, you could try (we already have a sample setup available, for example, here with all the necessary configs).

Don't expect that it will work out of the box, though. Some additional changes will be needed depending on cluster/non-cluster setup. E.g., how dataset DDLs look like, what additional options to pass (at least wait_end_of_query=1 + insert_quorum=$NODE_COUNT for cluster), and many other things.

See how different table creation is for single-node/cluster/CH Cloud in clickhouse-js tests: https://github.com/ClickHouse/clickhouse-js/blob/main/packages/client-common/__tests__/fixtures/simple_table.ts#L23-L48

Client instance is also different across these:

  1. General settings (all queries) https://github.com/ClickHouse/clickhouse-js/blob/main/packages/client-common/__tests__/utils/client.ts#L36-L41
  2. DDLs settings: https://github.com/ClickHouse/clickhouse-js/blob/main/packages/client-common/__tests__/utils/client.ts#L114-L117

All of this needs to be ported to Clojure tests somehow, and, to add a bit more: many tests, especially from the Metabase bundle which we run every time, use datasets created in place using tx/dataset-definition calls. ON CLUSTER part should also be squeezed in there somehow.

mhkarimi1383 commented 1 year ago

@slvrtrn Yeah I see, but I will try to do it as much as I can 😁

slvrtrn commented 1 year ago

@mhkarimi1383, please try https://github.com/ClickHouse/metabase-clickhouse-driver/releases/tag/1.2.2

mhkarimi1383 commented 1 year ago

@mhkarimi1383, please try https://github.com/ClickHouse/metabase-clickhouse-driver/releases/tag/1.2.2

I will do it today

mhkarimi1383 commented 1 year ago

@slvrtrn fixed, thank you :)

slvrtrn commented 5 months ago

I am closing this, as the offending feature was disabled in the end, and unblocked the driver usage on clustered instances.

Proper implementation of the "connection impersonation" feature that correctly works with clusters will be tracked in this one: https://github.com/ClickHouse/metabase-clickhouse-driver/issues/219