[core][spark] Support adding and dropping nested columns in Spark

apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.

https://paimon.apache.org/

Apache License 2.0

[core][spark] Support adding and dropping nested columns in Spark #4483

Closed. tsreaper closed this 1 week ago

tsreaper commented 1 week ago

Purpose

Spark SQL supports adding and dropping nested columns (fields inside a struct-typed column), but Paimon currently does not. This PR adds support for both operations on Paimon tables in Spark.
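
As a rough illustration of the statements this is meant to cover, here is a minimal Spark SQL sketch. The table name `t` and its fields are hypothetical, and it assumes a Paimon catalog is the current catalog in the Spark session; the exact set of supported forms is defined by the PR's tests, not by this sketch.

```sql
-- Hypothetical table with a struct-typed column `nc`.
CREATE TABLE t (a INT, nc STRUCT<f1: INT, f2: STRING>);

-- Add a nested field `f3` inside the struct column `nc`.
ALTER TABLE t ADD COLUMN nc.f3 STRING;

-- Drop the nested field `f2` from the struct column `nc`.
ALTER TABLE t DROP COLUMN nc.f2;
```

The dotted path (`nc.f3`) is how Spark SQL addresses a field nested inside a struct column in `ALTER TABLE` statements.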

Tests

Unit tests and IT cases.

API and Format

No format changes.

Documentation

This feature does not need documentation.

JingsongLi commented 1 week ago

Try to support this in Flink SQL too:

Flink SQL> create table s1 (a int, nc row<f1 int, f2 string> ) with ('connector'='datagen', 'number-of-rows'='10');
[INFO] Execute statement succeed.

Flink SQL> alter table s1 modify nc row<f1 int, f2 string, f3 string>;
[INFO] Execute statement succeed.

tsreaper commented 1 week ago

Let's implement this in another PR.