Closed the4thamigo-uk closed 5 months ago
hmm, it always seems to fully repopulate the entire table every time I run...
I suggested an idea as a PR for discussion.
I have a little bit more broad issue with materialised views which includes the topic of the question.
I use material views over very large tables, so when I try to run a new material view model in dbt, it tries to populate it by running an insert into
query.
In my case, it is very large so I Clickhouse fails to run it. For some reason, I expected that it would just create an mv and target table without populating it.
The current logic is also potentially prone to handling data, which will come moments in time between the populate query and creating materialized view, as both operations are not transactional and the populate query takes some time to run.
The potential workaround for me right now is to avoid this big populate query by adding additional conditions into model SQL that would filter only "fresh data".
The overall issue looks very similar to already existing challenges in incremental models which are handled by is_incremental
logic. Maybe we should use the same approach both in incremental and materialised_view models?
It would include the issue with unsupported --full-refresh logic, as it is a part of expected incremental model behavior
Maybe I would even expect the materialized view to be a special subtype of the incremental model, where "increments" are handled by created mat view instead of dbt runs.
@tema-popov I also have similar concerns about missing out some data. With the --full-refresh
repopulation switched off, there is still a small time gap between dropping the MV and recreating it, in which time data might be missed.
Maybe https://clickhouse.com/docs/en/sql-reference/statements/alter/view should be used to alter the MV query without interrupting the ingeation process?
@tema-popov The PR for this is now merged. Maybe you want to raise a new issue for the problem you are facing?
@the4thamigo-uk sure, it is a far broader topic, I agree it should be discussed separately.
I'm seeing my materialized views being completely recreated from scratch and fully bakfilled, more than necessary. What logic determines when this happens?