Open dtiesling opened 9 months ago
It looks the default method to create tables in Snowflake uses the False flag for CHANGE_TRACKING
and the solution is to implement the ALTER TABLE t1 SET CHANGE_TRACKING = TRUE;
(Snowflake documentation example page). I'm not familiar with this feature and the impact it can cause in the current connector implementation. I asked to the destination team take a look, thanks for opening the discussion @dtiesling to make the connector better 🥇
As a note, the reason this happens is that a full refresh sync is accomplished by creating a new table each sync, so that while we are loading up the new table, the old table and data is still useable. The last step of the sync is a begin; drop table old, rename table new to old; commit
@dtiesling - do you happen to have any insight into how much more Snowflake will charge to enable change tracking? It looks like under the hood this would add 2 'virtual' columns to the table, so there would be increase storage cost... but what about compute cost to calculate this data on write?
Furthermore, with the underlying tables actually being dopped under the hood, would change tracking work?
@evantahler Change tracking will work since the dynamic table just require that it's enabled. If there is no history they will just do a full write.
That being said it would be good to confirm that the dynamic table still works after the drop/rename. I'm not 100% confident it won't throw a different error after CHANGE_TRACKING is flipped back on. I can do some testing next week. If you'd like to do any testing on your side here's a simple set of commands to create a dynamic table and test the refresh.
CREATE OR REPLACE DYNAMIC TABLE dynamic_table_test as SELECT * FROM <airbyte_table>;
ALTER DYNAMIC TABLE dynamic_table_test REFRESH;
re: pricing - The change tracking is a cloud service layer feature so there should be no metered cost associated. There would only be the marginal increase in storage cost for the data.
Sorry to have let this die. We did confirm that the dynamic tables will continue to work when the table is dropped and remade. Our current workaround is that we have a snowflake job cron'd that enables change tracking on all of the airbyte tables. Is there any appetite to add an option to airbyte to enable change tracking or just set it true on all of the tables?
Hey all! I just did a quick PR to add it in as an option within Airbyte. This is something that our team really needed as well, and like was said above, I think that Snowflake is going to push it even more!
Connector Name
destination-snowflake
Connector Version
3.4.22
What step the error happened?
During the sync
Relevant information
In my Snowflake data pipeline I am using Dynamic Tables on top of the destination tables created by Airbyte. The dynamic tables rely on CHANGE_TRACKING being enabled. Each time I trigger a "full overwrite" sync the new Airbyte tables have CHANGE_TRACKING disabled and it breaks the Dynamic Table.
Snowflake is pushing the Dynamic Table feature heavily so I suspect this will start to impact more and more users.
It looks like it could possibly be a one-line update to the sql statement that creates tables in the connector if this does not need to be configurable on the destination.
Relevant log output
No response
Contribute