delta-io / delta-kernel-rs

A native Delta implementation for integration with any query engine
Apache License 2.0

Restrict enabling CDF on tables with reserved column names #524

Open OussamaSaoudi-db opened 3 days ago

OussamaSaoudi-db commented 3 days ago

Please describe why this is necessary.

Change data feed generates three columns: _commit_timestamp, _commit_version, and _change_type. These columns are generated by TableChanges.

TableChanges assumes that the table schema does not already contain columns with these names. Moreover, Spark disallows enabling CDF on tables that have columns with these names.

When writing tables, we must ensure that CDF cannot be enabled on tables with the reserved column names.

Describe the functionality you are proposing.

Ensure that when writing a table or updating its metadata, the table cannot both have CDF enabled and contain any of the reserved column names.
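A minimal sketch of what such a check might look like, in Rust. The function name, its signature, and the `String` error type are hypothetical and not part of the delta-kernel-rs API; the schema is reduced to a list of top-level column names. Comparing names case-insensitively is an assumption here, not something this issue specifies. `delta.enableChangeDataFeed` is the Delta table property that enables CDF.

```rust
use std::collections::HashMap;

/// Column names generated by change data feed, as listed in this issue.
const CDF_RESERVED_COLUMNS: [&str; 3] =
    ["_change_type", "_commit_version", "_commit_timestamp"];

/// Delta table property that enables change data feed.
const ENABLE_CDF_KEY: &str = "delta.enableChangeDataFeed";

/// Hypothetical helper (not a delta-kernel-rs API): rejects metadata that
/// enables CDF while the schema contains a reserved column name.
/// Names are compared case-insensitively, which is an assumption.
fn validate_cdf_reserved_columns(
    column_names: &[&str],
    table_properties: &HashMap<String, String>,
) -> Result<(), String> {
    let cdf_enabled = table_properties
        .get(ENABLE_CDF_KEY)
        .is_some_and(|value| value.eq_ignore_ascii_case("true"));
    if !cdf_enabled {
        // Without CDF, the reserved names are ordinary column names.
        return Ok(());
    }

    // Collect every column that collides with a reserved CDF name.
    let conflicts: Vec<&str> = column_names
        .iter()
        .copied()
        .filter(|name| {
            CDF_RESERVED_COLUMNS
                .iter()
                .any(|reserved| reserved.eq_ignore_ascii_case(name))
        })
        .collect();

    if conflicts.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "cannot enable change data feed: schema contains reserved column(s): {}",
            conflicts.join(", ")
        ))
    }
}
```

A check like this would presumably run both when a table is created and whenever a commit changes the schema or the table properties, so that neither path can produce the invalid combination.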

Additional context

No response

scovich commented 3 days ago

Also, we need to block adding those columns to a table that already has CDF enabled.
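For illustration only, the schema-change direction could be sketched in Rust as below. Both function names are hypothetical and not part of delta-kernel-rs; schemas are reduced to lists of top-level column names, and the case-insensitive comparison is an assumption.

```rust
/// Column names generated by change data feed, as listed in this issue.
const CDF_RESERVED_COLUMNS: [&str; 3] =
    ["_change_type", "_commit_version", "_commit_timestamp"];

/// Hypothetical helper: returns the reserved names that appear in the new
/// schema but not in the old one. Case-insensitive matching is an assumption.
fn newly_added_reserved_columns<'a>(
    old_columns: &[&str],
    new_columns: &[&'a str],
) -> Vec<&'a str> {
    new_columns
        .iter()
        .copied()
        // Keep only names that collide with a reserved CDF column.
        .filter(|name| {
            CDF_RESERVED_COLUMNS
                .iter()
                .any(|reserved| reserved.eq_ignore_ascii_case(name))
        })
        // Keep only names that were not already in the old schema.
        .filter(|name| !old_columns.iter().any(|old| old.eq_ignore_ascii_case(name)))
        .collect()
}

/// Hypothetical check for a schema change on an existing table: fails when
/// CDF is enabled and the change adds a reserved column.
fn check_schema_change(
    cdf_enabled: bool,
    old_columns: &[&str],
    new_columns: &[&str],
) -> Result<(), String> {
    if !cdf_enabled {
        return Ok(());
    }
    let added = newly_added_reserved_columns(old_columns, new_columns);
    if added.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "cannot add reserved CDF column(s) to a CDF-enabled table: {}",
            added.join(", ")
        ))
    }
}
```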