ray-project / deltacat

A portable Pythonic Data Catalog API powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to your big data workloads.
Apache License 2.0
166 stars, 23 forks

Support null as a valid primary key value #352

Closed: raghumdani closed this 2 months ago

raghumdani commented 2 months ago

This PR adds support for tables with null primary key values. We observed one table that contained null primary keys and whose schema declared the primary key column as nullable. This violates the definition of a primary key. However, in the short term we will follow Spark's compaction behavior, and in the long term we will roll back to the current behavior, where null primary keys are not supported.
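
To illustrate what treating null as a valid primary key value could mean during compaction, here is a minimal sketch in plain Python. It is not the code in this PR and does not use the deltacat API: the function name `dedupe_by_primary_key`, the `pk` parameter, and the record layout are hypothetical. It also assumes that all null keys are grouped together as one key and that the latest record wins, which may not match Spark's behavior exactly.

```python
from typing import Any, Dict, Hashable, List, Optional


def dedupe_by_primary_key(
    records: List[Dict[str, Any]], pk: str = "id"
) -> List[Dict[str, Any]]:
    """Keep the last record seen for each primary key value.

    Hypothetical helper: None is treated as an ordinary key value, so all
    records with a null primary key collapse into a single record.
    """
    latest: Dict[Optional[Hashable], Dict[str, Any]] = {}
    for record in records:
        # A missing or null key maps to None, which is a valid dict key.
        latest[record.get(pk)] = record
    # dicts preserve the insertion order of the first occurrence of each key.
    return list(latest.values())


records = [
    {"id": 1, "value": "a"},
    {"id": None, "value": "b"},
    {"id": 1, "value": "c"},
    {"id": None, "value": "d"},
]
compacted = dedupe_by_primary_key(records)
```

With this input, `compacted` holds two records: the latest one for key `1` and the latest one for the null key. Under the long-term behavior described above, the same input would instead be rejected because null primary keys are not supported.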