delta-io / delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
https://delta.io
Apache License 2.0

[doc] NOT NULL in ALTER TABLE ADD COLUMNS is not mentioned in the doc #3360

Open li-boxuan opened 4 months ago

li-boxuan commented 4 months ago

I understand this might be by design, but https://docs.delta.io/latest/delta-batch.html#add-columns could be clearer. It currently says:

By default, nullability is true.

which gives me the impression that one can override the nullability, but I got:

spark-sql (default)> ALTER TABLE basic ADD COLUMNS (col1_nonnull2 integer NOT NULL);
[DELTA_OPERATION_NOT_ALLOWED] Operation not allowed: `NOT NULL in ALTER TABLE ADD COLUMNS` is not supported for Delta tables

Also, note that due to #831, you cannot change an existing nullable column to non-nullable. This means the ONLY way to have a non-nullable column is to declare it as NOT NULL when creating the table.
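
To make the contrast concrete, here is a minimal sketch of what is accepted and what is rejected. The table name `basic_v2` and its columns are hypothetical and chosen only for illustration; the rejected statement mirrors the one in the output above.

```sql
-- Hypothetical table and column names, for illustration only.
-- NOT NULL is accepted when the column is declared at table creation.
CREATE TABLE basic_v2 (
  id   INT NOT NULL,
  col1 INT
) USING DELTA;

-- Accepted: a column added later is nullable.
ALTER TABLE basic_v2 ADD COLUMNS (col2 INT);

-- Rejected with DELTA_OPERATION_NOT_ALLOWED, as in the output above:
-- ALTER TABLE basic_v2 ADD COLUMNS (col3 INT NOT NULL);
```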

Version:

        :: modules in use:
        io.delta#delta-spark_2.12;3.1.0 from central in [default]
        io.delta#delta-storage;3.1.0 from central in [default]
        org.antlr#antlr4-runtime;4.9.3 from central in [default]

rishabhchaudha commented 3 months ago

I've noticed that the code intentionally throws an error when NOT NULL is specified. This decision should be by design, especially during schema alterations where the default value for new columns is logically null.
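
A small sketch of why that is, assuming a hypothetical table named `events`: rows written before the schema change have no stored value for the new column, so they read back as NULL, which a NOT NULL column could not allow.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE events (id INT NOT NULL) USING DELTA;
INSERT INTO events VALUES (1), (2);

-- Add a nullable column after rows already exist.
ALTER TABLE events ADD COLUMNS (note STRING);

-- Both pre-existing rows return NULL for the new column.
SELECT id, note FROM events;
```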

li-boxuan commented 3 months ago

This decision should be by design, especially during schema alterations where the default value for new columns is logically null.

Yeah, I agree with that. I would consider it a glitch in the doc that should be fixed.

MrPowers commented 3 months ago

@li-boxuan - thanks for creating this issue. Would you like to submit a pull request to update the language of the docs?