ClickHouse / dbt-clickhouse

The Clickhouse plugin for dbt (data build tool)
Apache License 2.0
254 stars 113 forks source link

Running DBT on MacOS with dbt-clickhouse does not support renameat2() #340

Open LPauzies opened 3 months ago

LPauzies commented 3 months ago

Describe the bug

When running : dbt run --target=clickhouse-dev --select=dbt_mart on MacOS Sonoma 14.6.1 (MacBook Pro (13-inch, M1, 2020))

08:27:45  Running with dbt=1.7.17
08:27:46  Registered adapter: clickhouse=1.7.7
08:27:46  Found 65 models, 3 seeds, 280 tests, 85 sources, 0 exposures, 0 metrics, 942 macros, 0 groups, 0 semantic models
08:27:46  
08:27:48  dbt_clickhouse adapter: ClickHouse server does not support the EXCHANGE TABLES command
08:27:48  dbt_clickhouse adapter: This can be caused by an obsolete ClickHouse version or by running ClickHouse on
08:27:48  dbt_clickhouse adapter: an operating system that does not support the low level renameat2() system call.
08:27:48  dbt_clickhouse adapter: Some DBT materializations will be slower and not atomic as a result.
08:27:49  dbt_clickhouse adapter: Unexpected server exception dropping table
08:27:49  
08:27:49  Finished running  in 0 hours 0 minutes and 2.68 seconds (2.68s).
08:27:49  Encountered an error:
Database Error
  'NoneType' object has no attribute 'version_major'
make: *** [dbt-run] Error 2

With :

While the same code is running perfectly in Linux distributions.

Steps to reproduce

  1. Get a MacOS Sonoma 14.6.1
  2. Create a DBT project
  3. Create a clickhouse-server instance with version 23.8.4.69
  4. Create a connection profile for this instance called "my_profile"
  5. Create a dummy model called "my_mart" in the DBT project
  6. Execute dbt run --target=my_profile --select=my_mart

Expected behaviour

A correct DBT materialization

Code examples, such as models or profile settings

profiles.yml

clickhouse-dev:
      type: clickhouse
      driver: native
      secure: true
      verify: false
      schema: dbt
      retries: 3
      connect_timeout: 100
      send_receive_timeout: 3000
      sync_request_timeout: 15
      threads: 4
      host: localhost # SSH tunneling
      port: 9430  # ssh port mapping
      user: user
      password: password
      custom_settings: {max_bytes_before_external_group_by: '68719480000', max_bytes_before_external_sort: '68719480000'}

dbt and/or ClickHouse server logs

08:27:45  Running with dbt=1.7.17
08:27:46  Registered adapter: clickhouse=1.7.7
08:27:46  Found 65 models, 3 seeds, 280 tests, 85 sources, 0 exposures, 0 metrics, 942 macros, 0 groups, 0 semantic models
08:27:46  
08:27:48  dbt_clickhouse adapter: ClickHouse server does not support the EXCHANGE TABLES command
08:27:48  dbt_clickhouse adapter: This can be caused by an obsolete ClickHouse version or by running ClickHouse on
08:27:48  dbt_clickhouse adapter: an operating system that does not support the low level renameat2() system call.
08:27:48  dbt_clickhouse adapter: Some DBT materializations will be slower and not atomic as a result.
08:27:49  dbt_clickhouse adapter: Unexpected server exception dropping table
08:27:49  
08:27:49  Finished running  in 0 hours 0 minutes and 2.68 seconds (2.68s).
08:27:49  Encountered an error:
Database Error
  'NoneType' object has no attribute 'version_major'
make: *** [dbt-run] Error 2

Configuration

Environment

ClickHouse server

simpl1g commented 3 months ago

Please run dbt with debug option to see full logs or look into ClickHouse query log to see the error

dbt run --target=clickhouse-dev --select=dbt_mart --debug
or
SELECT query, exception FROM system.query_log WHERE exception != '' and event_time ...
LPauzies commented 3 months ago

Thanks @simpl1g for the response, here is a more detailed stack trace of the failed execution.

dbt run --target=clickhouse-dev --select=dbt_mart --debug
13:09:16  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1196f5510>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x119945b50>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x119947cd0>]}
13:09:16  Running with dbt=1.7.17
13:09:16  running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'log_cache_events': 'False', 'write_json': 'True', 'partial_parse': 'True', 'cache_selected_only': 'False', 'profiles_dir': './transform_dbt', 'fail_fast': 'False', 'warn_error': 'None', 'log_path': 'transform_dbt/logs', 'debug': 'True', 'version_check': 'True', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'log_format': 'default', 'static_parser': 'True', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'invocation_command': 'dbt run --target=clickhouse-dev --select=dbt_mart --debug', 'target_path': 'None', 'introspect': 'True', 'send_anonymous_usage_stats': 'True'}
13:09:17  Sending event: {'category': 'dbt', 'action': 'project_id', 'label': 'b49fbec0-9cf8-4e66-b6fb-89914e1ea015', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x119295910>]}
13:09:17  Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': 'b49fbec0-9cf8-4e66-b6fb-89914e1ea015', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1196f77d0>]}
13:09:17  Registered adapter: clickhouse=1.7.7
13:09:17  checksum: 2a0b96f78ef5402e998d105a8570f3d1fecbe0683b92582b64db2b1f3758d22f, vars: {}, profile: , target: ch-off, version: 1.7.17
13:09:18  Partial parsing enabled: 0 files deleted, 0 files added, 0 files changed.
13:09:18  Partial parsing enabled, no changes found, skipping parsing
13:09:18  Sending event: {'category': 'dbt', 'action': 'load_project', 'label': 'b49fbec0-9cf8-4e66-b6fb-89914e1ea015', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11a0fe050>]}
13:09:18  Sending event: {'category': 'dbt', 'action': 'resource_counts', 'label': 'b49fbec0-9cf8-4e66-b6fb-89914e1ea015', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x119c56950>]}
13:09:18  Found 65 models, 3 seeds, 280 tests, 85 sources, 0 exposures, 0 metrics, 942 macros, 0 groups, 0 semantic models
13:09:18  Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': 'b49fbec0-9cf8-4e66-b6fb-89914e1ea015', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11a5a9890>]}
13:09:18  
13:09:18  Acquiring new clickhouse connection 'master'
13:09:18  Acquiring new clickhouse connection 'list_'
13:09:18  Opening a new connection, currently in state init
13:09:20  dbt_clickhouse adapter: ClickHouse server does not support the EXCHANGE TABLES command
13:09:20  dbt_clickhouse adapter: This can be caused by an obsolete ClickHouse version or by running ClickHouse on
13:09:20  dbt_clickhouse adapter: an operating system that does not support the low level renameat2() system call.
13:09:20  dbt_clickhouse adapter: Some DBT materializations will be slower and not atomic as a result.
13:09:20  dbt_clickhouse adapter: Unexpected server exception dropping table
13:09:20  dbt_clickhouse adapter: Error running SQL: macro list_schemas
13:09:20  Connection 'master' was properly closed.
13:09:20  Connection 'list_' was left open.
13:09:20  On list_: No close available on handle
13:09:20  
13:09:20  Finished running  in 0 hours 0 minutes and 1.66 seconds (1.66s).
13:09:20  Encountered an error:
Database Error
  'NoneType' object has no attribute 'version_major'
13:09:20  Resource report: {"command_name": "run", "command_wall_clock_time": 4.1928444, "process_user_time": 2.90538, "process_kernel_time": 0.842324, "process_mem_max_rss": "162955264", "command_success": false, "process_in_blocks": "0", "process_out_blocks": "0"}
13:09:20  Command `dbt run` failed at 15:09:20.746088 after 4.19 seconds
13:09:20  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1196f6f90>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x119957d90>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x119957f50>]}
13:09:20  Flushing usage events
make: *** [dbt-run] Error 2
genzgd commented 3 months ago

the rename2 message is just information. This is almost certainly an unrelated bug. Can you create a small reproducible example from your project?

genzgd commented 3 months ago

I don't understand why these messages don't print more details :(

13:09:20  dbt_clickhouse adapter: Unexpected server exception dropping table
13:09:20  dbt_clickhouse adapter: Error running SQL: macro list_schemas

Is there anything in the ClickHouse server logs that shows some DROP related exception?

LPauzies commented 3 months ago

Hello @genzgd, good idea ! I got something like :

SELECT
    query,
    exception
FROM system.query_log
WHERE (exception != '') AND (event_time >= '2024-08-16')

Query id: af829311-a81f-4eb0-91a0-ccb30177a97e

Row 1:
──────
query:     EXCHANGE TABLES __dbt_exchange_test_0_6ba9cdd65ba911efb5164aefc927b229 AND __dbt_exchange_test_1_6ba9cdd65ba911efb5164aefc927b229
exception: Code: 497. DB::Exception: user: Not enough privileges. To execute this query, it's necessary to have the grant SELECT, DROP TABLE ON dbt.__dbt_exchange_test_0_6ba9cdd65ba911efb5164aefc927b229. (ACCESS_DENIED) (version 23.8.4.69 (official build))

So I went to system.users and saw that my user had not the same rights as me. However, this is not normal to have such an error message that blurred the real problem. Is there a possibility to change this and make it more reliable ? I let you guys close the issue then :+1:

genzgd commented 3 months ago

Agreed that we should be catching that error and printing something more useful in the logs. We'll take a look.

LPauzies commented 3 months ago

Super ! Thanks @genzgd !