ClickHouse / dbt-clickhouse

The Clickhouse plugin for dbt (data build tool)
Apache License 2.0
254 stars 113 forks source link

Adding new unique key column - fixes #322 #323

Closed the4thamigo-uk closed 3 months ago

the4thamigo-uk commented 4 months ago

Summary

Proposed solution to support adding new columns that are also added to the unique_key.

CLAassistant commented 4 months ago

CLA assistant check
All committers have signed the CLA.

the4thamigo-uk commented 4 months ago

fixes #322

BentsiLeviav commented 3 months ago

Hi @the4thamigo-uk Thank you for your contribution!

This https://github.com/ClickHouse/dbt-clickhouse/pull/332 was opened a few days ago and handles schema changes. Generally, support in the sync_all_columns option was introduced. Could you check if it solves your issue?

Thanks

the4thamigo-uk commented 3 months ago

@BentsiLeviav Ah yes I noticed the legacy fallback, but I was aiming to do a minimal fix. Great that there is a better solution.

I suppose if the PR passes a test which is equivalent to https://github.com/ClickHouse/dbt-clickhouse/pull/323/files#diff-f2e636456ac1c2366b5fd10e6beaf3560a946882645f7cee6ba0e0501bf301e3 then it would be fine.

BentsiLeviav commented 3 months ago

I see that the table creation passes successfully, but is there a chance there is a mistake in this assertion? After running the test the table looks like this: image

the4thamigo-uk commented 3 months ago

I see that the table creation passes successfully, but is there a chance there is a mistake in this assertion? After running the test the table looks like this: image

Yes its interesting to know what the behaviour should be I suppose. I guess you are suggesting row 4 should not be present?

the4thamigo-uk commented 3 months ago

HMm when I run it I dont see your row 4?

$ pytest -k test_append_unique_key
==================================================================================================================== test session starts =====================================================================================================================
platform linux -- Python 3.12.3, pytest-8.3.2, pluggy-1.5.0
configfile: pytest.ini
testpaths: tests/integration, #, name, per, convention
collected 159 items / 158 deselected / 1 selected                                                                                                                                                                                                            

tests/integration/adapter/incremental/test_schema_change.py 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PDB set_trace (IO-capturing turned off) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> /home/xxx/venatus/dbt-clickhouse/tests/integration/adapter/incremental/test_schema_change.py(90)test_append_unique_key()
-> assert result[0][2] == 0
(Pdb) result
[(0, 1, 0), (1, 2, 0), (2, 3, 4), (3, 4, 5), (4, 5, 6)]
BentsiLeviav commented 3 months ago

@the4thamigo-uk What ClickHouse version do you use?

the4thamigo-uk commented 3 months ago

@the4thamigo-uk What ClickHouse version do you use?

24.3.4.147 - but would be surprised if it makes a difference...

BentsiLeviav commented 3 months ago

That is really weird. Did you test it with the changes contributed at #332 ? At the beginning, the table has the following data:

image

and the incremental query:


select
    number as col_1,
    number + 1 as col_2,
    number + 2 as col_3
from numbers(2, 3)

yields this:

image

So the append of these 2 groups would be 6 rows:

image

Anyway, the error mentioned in #322 doesn't appear anymore. I'm closing this PR and the issue, feel free to comment/reopen if needed or if something is wrong.

the4thamigo-uk commented 3 months ago

Yeah weird... anyway, the incremental schema changes is much better. Thanks @canbekley