delta-io / delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
https://delta.io
Apache License 2.0

[Feature Request] Column default values for Delta Lake #2238

Closed dtenedor closed 7 months ago

dtenedor commented 10 months ago

Feature request

Overview

This is a proposal to support column default values for Delta Lake tables. Users should be able to associate default values with Delta Lake columns at table creation time or thereafter.

Motivation

Support for column defaults is a key requirement to facilitate updating the table schema over time and performing DML operations on wide tables with sparse data.

Further details

Please refer to an open design doc here.

This should integrate with Apache Spark's column default feature, and also represent the column metadata in a general way such that other Delta Lake clients can understand it.

For example:

-- The CREATE TABLE statement may specify a DEFAULT value for a column.
CREATE TABLE T (a INT, c STRING DEFAULT CONCAT('abc', 'def'))
  USING DELTA;
INSERT INTO T(a) VALUES (42);
INSERT INTO T VALUES (43, DEFAULT);
SELECT * FROM T;
(42, 'abcdef')
(43, 'abcdef')

Willingness to contribute

The Delta Lake Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature?

felipepessoto commented 10 months ago

I can't add comments to the doc.

Could you please add an example of metadata in the Delta Log?

dtenedor commented 10 months ago

@felipepessoto Thanks for joining the thread, I will add some examples. (The only way I was able to share the doc for now is by publishing it to the web as read-only. Please feel free to make comments here.)

I am preparing a protocol doc change as well as the first pull request :) I can add you to the reviewers if you prefer.

felipepessoto commented 10 months ago

Yeah, I added some comments to #2240

Thanks!

felipepessoto commented 10 months ago

BTW, I found that Google Docs lets you share a doc in two different ways. To allow Print/Download, and optionally grant comment permission, you need to use this:

image

dtenedor commented 7 months ago

This is done.

ash123ok45 commented 5 months ago

I'm using a CREATE TABLE statement to set a default literal for a column, along with NOT NULL and a comment, and I still get an error.

Create table table_name( columnA STRING NOT NULL default '000' comment 'sadhksjhd',

[WRONG_COLUMN_DEFAULTS_FOR_DELTA_FEATURE_NOT_ENABLED] Failed to execute CREATE TABLE command because it assigned a column DEFAULT value, but the corresponding table feature was not enabled. Please retry the command again after executing ALTER TABLE tableName SET TBLPROPERTIES('delta.feature.allowColumnDefaults' = 'supported').
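
The error message above asks for the allowColumnDefaults table feature to be enabled, but ALTER TABLE cannot run against a table that was never created. A minimal sketch of one possible workaround follows, assuming the feature can also be enabled through TBLPROPERTIES in the CREATE TABLE statement itself. The single-column schema, the comment text, and the existing_table name are illustrative and do not come from the reporter's full statement.

```sql
-- Sketch only: enable the table feature in the same statement that assigns the default.
-- Assumption: the property named in the error message is accepted at creation time.
CREATE TABLE table_name (
  columnA STRING NOT NULL DEFAULT '000' COMMENT 'example comment'
)
USING DELTA
TBLPROPERTIES ('delta.feature.allowColumnDefaults' = 'supported');

-- For a table that already exists (existing_table is a hypothetical name),
-- enable the feature first, then assign the default to the column.
ALTER TABLE existing_table
  SET TBLPROPERTIES ('delta.feature.allowColumnDefaults' = 'supported');
ALTER TABLE existing_table ALTER COLUMN columnA SET DEFAULT '000';
```

If this works as assumed, an INSERT that omits columnA, or passes DEFAULT for it, should store '000', matching the behavior described in the proposal.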