Closed jorritsandbrink closed 1 month ago
Name | Link |
---|---|
Latest commit | e16945591fbca429cb4ec35aa0b25a6e2109b821 |
Latest deploy log | https://app.netlify.com/sites/dlt-hub-docs/deploys/66ccf9932c7d810009062456 |
@rudolfix
If `arrow_ds` is empty you do not evolve the schema. IMO that should happen. Please add a test for it (`if arrow_ds.head(1).num_rows == 0:`).
Done.
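A dependency-free sketch of the behaviour requested above. This is only an illustration, not the code in this PR: `FakeTable`, `missing_columns`, and `write_batch` are hypothetical names, and plain dicts stand in for the Arrow and Delta schemas. It shows the ordering only: new columns are added before the emptiness check, so an empty batch still evolves the schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeTable:
    """Hypothetical stand-in for a Delta table: a schema plus stored rows."""

    schema: Dict[str, str]
    rows: List[dict] = field(default_factory=list)


def missing_columns(table_schema: Dict[str, str], batch_schema: Dict[str, str]) -> Dict[str, str]:
    """Return columns present in the incoming batch but absent from the table."""
    return {name: typ for name, typ in batch_schema.items() if name not in table_schema}


def write_batch(table: FakeTable, batch_schema: Dict[str, str], batch_rows: List[dict]) -> None:
    # Evolve the schema first, regardless of whether the batch has rows.
    table.schema.update(missing_columns(table.schema, batch_schema))
    # Only the data write is skipped for an empty batch
    # (the real check discussed above is `arrow_ds.head(1).num_rows == 0`).
    if not batch_rows:
        return
    table.rows.extend(batch_rows)


table = FakeTable(schema={"id": "int64"})
write_batch(table, {"id": "int64", "name": "string"}, [])  # empty batch
print(table.schema)  # {'id': 'int64', 'name': 'string'}
```

A test for the real implementation could follow the same shape: load an empty batch that carries a new column, then assert that the stored table schema contains it.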
Should we update all table schemas like in other destinations, where it happens in `update_stored_schema`? If you agree, let's create a ticket for that.
Three options:
`update_stored_schema` is a good idea. We currently do 3. Option 1 is not possible yet, but might become possible when the linked tickets are done (they are already assigned, so that could be soon). Option 2 is possible, but is a bigger burden on our side. Which do you prefer?
Same thing for truncating tables before the load. This is actually used by the `refresh` option.
Okay, then we should probably use it.
@jorritsandbrink
So what I'd do:
- in `update_stored_schema`
- in truncate / drop tables (`refresh` options should work after that)
- migrating schema

You already have all the building blocks for (2), and IMO it makes sense to migrate tables before we start loading, but the priority is low.
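One possible reading of this ordering, as a minimal sketch: migrate all table schemas up front, truncate where a refresh-style run requires it, and only then load. `SketchClient` and its methods are hypothetical and are not dlt's actual job client API; only the method name `update_stored_schema` is borrowed from the discussion.

```python
from typing import Dict, Iterable, List

Schema = Dict[str, str]  # column name -> type name (simplified)


class SketchClient:
    """Hypothetical destination client used only to illustrate the ordering."""

    def __init__(self) -> None:
        self.schemas: Dict[str, Schema] = {}
        self.rows: Dict[str, List[dict]] = {}

    def update_stored_schema(self, desired: Dict[str, Schema]) -> List[str]:
        """Migrate every table up front; return the names of tables that changed."""
        changed = []
        for name, schema in desired.items():
            current = self.schemas.setdefault(name, {})
            self.rows.setdefault(name, [])
            new_cols = {c: t for c, t in schema.items() if c not in current}
            if new_cols:
                current.update(new_cols)
                changed.append(name)
        return changed

    def truncate_tables(self, names: Iterable[str]) -> None:
        """Remove rows but keep the schema, as a step before loading."""
        for name in names:
            self.rows[name] = []

    def load(self, name: str, batch: List[dict]) -> None:
        """Append a batch of rows to an already migrated table."""
        self.rows[name].extend(batch)


client = SketchClient()
client.update_stored_schema({"events": {"id": "int64"}})
client.load("events", [{"id": 1}])

# Next run: schema migration, then truncation, then the load itself.
client.update_stored_schema({"events": {"id": "int64", "name": "string"}})
client.truncate_tables(["events"])
client.load("events", [{"id": 2, "name": "a"}])
```

Because migration happens before any load job starts, a table whose batch turns out to be empty still ends up with the new columns.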
Description
- `merge` write disposition with the `delta` table format
- `deltalake` version to access the `add_columns` method
- `get_delta_tables` for pipelines with multiple schemas (that may explain the problem with "missing" delta tables)

Related Issues
Fixes #1739