ClickHouse / dbt-clickhouse

The Clickhouse plugin for dbt (data build tool)
Apache License 2.0
252 stars 111 forks source link

Authenticate via SSH key instead of password #339

Open babaMar opened 3 months ago

babaMar commented 3 months ago

Is your feature request related to a problem? Please describe. I'd like to authenticate to ClickHouse via SSH key instead of password. I already have multiple users that have been created without password (... IDENTIFIED BY ssh_key ...) that needs to run DBT (I'd like to avoid creating a DBT user with password and share credentials around).

Describe the solution you'd like I was thinking the private_key_file field can be added to profiles.yml, I've seen this solution for Snowflake for example.

Describe alternatives you've considered A clear and concise description of any alternative solutions or features you've considered.

Additional context I could help with this if you can point me to the right direction.

genzgd commented 3 months ago

This would require updating one of the underlying Python clients to support ssh authentication. I believe that is theoretically possible with clickhouse-driver, but I don't know if there are plans to do so. For HTTP/clickhouse-connect, this is not currently possible as ClickHouse does not support ssh authentication over HTTP.

genzgd commented 3 months ago

Note that using a TLS/SSL private certificate should actually work if the ClickHouse server supports mutual TLS.

babaMar commented 3 months ago

I understand @genzgd thanks for your reply. So I guess at the moment dbt-clickhouse uses clickhouse-connect, can you point me to where the magic happens? I.E. how the credentials are passed along from profiles.yaml to the CH client to establish the connection?

genzgd commented 3 months ago

Note that dbt-clickhouse will use the clickhouse-driver client if the profile "driver" field is set to native.

Looking at the dbt-clickhouse code, unfortunately it looks like it would require a code change to support adding the key and certificate files to the clickhouse-connect constructor function. The relevant code is here.

We would need a way to pass additional constructor parameters from the credentials. My first thought is to add something like "client_parameters" as a dictionary in the credentials file and pass them as keyword args. That would avoid having to actually update the dbt-clickhouse code for every change in clickhouse-connect (or similar changes for clickhouse-driver).

Sample code for how to configure clickhouse-connect for mutual TLS authentication is in the integration test here.

babaMar commented 3 months ago

@genzgd thanks a lot for the info. I'll have a look. So you're saying changes would need to happen in two places? The second link is pointing to the same place of the first link if I'm not mistaken.

babaMar commented 3 months ago

@genzgd it seems the clickhouse_driver.Client supports the keyfile:

chc = Client(
    host='<host>',
    secure=True,
    user='<user>',
    keyfile='<myprivatekey>'
)

chc.execute('SHOW DATABASES')

though I'm not sure if that's passed along, it seems still looking for a password:

DB::Exception: securityadmin: Authentication failed: password is incorrect, or there is no user with such name.. Stack trace:

genzgd commented 3 months ago

I suspect that keyFile is for mutual TLS, not ssh authentication.