Open ion-elgreco opened 12 months ago
In principle I have no strong feelings about bundling some commands under a common property, much like we do for optimize.
Given the name alter, though, I would suggest restricting it to things that can be done via the ALTER TABLE command, as we may want or need to implement that operation at some point.
The way things seem to be going with the Delta Protocol, table features are front and center when it comes to configuring tables. Along with that, some configuration is becoming more complex. As such, we may consider exposing set_table_properties as a low-level (discouraged) API only and instead model this around table features. Something along the lines of:
def enable_feature(name: FeatureName, config: dict)...
The advantage may be that it is easier for us to validate the configuration, as configuration for specific features may include multiple keys that need to be consistent and (as far as I understand) may even require setting domain metadata at some point.
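To make the validation point concrete, here is a minimal sketch of the enable_feature idea in Python. Everything in it is an assumption for illustration: the FeatureName members, the _RULES table, and the function signature are hypothetical and not part of delta-rs, and the rules cover only two well-known properties rather than the full protocol requirements.

```python
from enum import Enum
from typing import Dict, Set


class FeatureName(Enum):
    # Hypothetical subset of table features, for illustration only.
    CHANGE_DATA_FEED = "changeDataFeed"
    COLUMN_MAPPING = "columnMapping"


# Hypothetical rule table: for each feature, the property keys it expects
# and the values accepted for each key.
_RULES: Dict[FeatureName, Dict[str, Set[str]]] = {
    FeatureName.CHANGE_DATA_FEED: {"delta.enableChangeDataFeed": {"true"}},
    FeatureName.COLUMN_MAPPING: {"delta.columnMapping.mode": {"id", "name"}},
}


def enable_feature(
    properties: Dict[str, str], name: FeatureName, config: Dict[str, str]
) -> Dict[str, str]:
    """Validate the config for one feature as a unit, then merge it.

    Returns a new properties dict; the input dict is not modified.
    """
    rules = _RULES[name]

    # Reject keys that do not belong to this feature.
    unknown = set(config) - set(rules)
    if unknown:
        raise ValueError(f"{name.value}: unexpected keys {sorted(unknown)}")

    # Every expected key must be present and hold an accepted value.
    for key, allowed in rules.items():
        if key not in config:
            raise ValueError(f"{name.value}: missing key {key!r}")
        if config[key] not in allowed:
            raise ValueError(f"{name.value}: invalid value for {key!r}")

    return {**properties, **config}
```

Because the whole config for a feature goes through one call, inconsistent or misplaced keys can be rejected before anything is written, which is harder to do when each key arrives separately through set_table_properties.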
@roeap it's mostly inspired by the SQL alter operations: https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-alter-table.html
I am not entirely following you on why set_table_properties should be a low-level API. Not every table property belongs to a certain feature, right? See: https://books.japila.pl/delta-lake-internals/DeltaConfigs/#appendOnly
Yes, there is config that is unrelated to features... I'm mainly saying that the config that is related to features should maybe be modeled as such...
@roeap I am going to start looking into this soon, I just want to clarify one thing: for configs that are related to features, should we raise when someone tries to add or remove them through set_table_properties?
Does this issue cover adding support for DDL statements in general, such as CREATE TABLE and ALTER TABLE ...? Currently this is only possible with Spark.
Create table is already covered by the create operation.
Some alter operations are already available, and adding more, such as the add columns operation, is still a work in progress.
Complete support for alter operations would make this project useful for lightweight migrations, removing the need for a Spark cluster to perform them.
Description
Use Case
Eventually, we will have multiple alterations possible on the table, such as setting/unsetting table properties, adding and removing columns, and so forth. We can cluster these nicely together under a single namespace called alter. The API will look like this:
Related Issue(s)
https://github.com/delta-io/delta-rs/issues/1663
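A minimal sketch of the alter namespace idea from the Use Case, in Python. The Table and TableAlterer classes below are hypothetical in-memory stand-ins, not the actual deltalake API. The sketch only shows how several alterations could hang off a single property, in the same way the thread mentions optimize doing.

```python
from typing import Dict, List


class TableAlterer:
    """Hypothetical namespace grouping ALTER TABLE-style operations."""

    def __init__(self, table: "Table") -> None:
        self._table = table

    def set_table_properties(self, properties: Dict[str, str]) -> None:
        # Add or overwrite the given properties.
        self._table.properties.update(properties)

    def unset_table_properties(self, keys: List[str]) -> None:
        # Remove properties; keys that are not set are ignored.
        for key in keys:
            self._table.properties.pop(key, None)

    def add_columns(self, columns: Dict[str, str]) -> None:
        # Append new columns; refuse to redefine an existing one.
        for name, dtype in columns.items():
            if name in self._table.schema:
                raise ValueError(f"column {name!r} already exists")
            self._table.schema[name] = dtype


class Table:
    """Hypothetical in-memory table holding only a schema and properties."""

    def __init__(self, schema: Dict[str, str]) -> None:
        self.schema: Dict[str, str] = dict(schema)
        self.properties: Dict[str, str] = {}

    @property
    def alter(self) -> TableAlterer:
        # All alterations are reached through this single namespace.
        return TableAlterer(self)


table = Table({"id": "long"})
table.alter.set_table_properties({"delta.appendOnly": "true"})
table.alter.add_columns({"name": "string"})
```

Keeping the alterations on a separate object leaves the table's top-level surface small and gives one obvious place to add further ALTER TABLE-style operations later.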