databricks / iceberg-kafka-connect

Apache License 2.0

Improvements to inferring schema with null and empty values #162

Closed bryanck closed 1 year ago

bryanck commented 1 year ago

This PR updates schema inference so that empty values are ignored and are not added to the table schema during table creation or schema evolution. Empty values include nulls, lists with no elements, lists whose first element is null, lists whose first element is an empty object, objects with no fields, and objects whose fields are all set to empty values.
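The rules above can be sketched as a small recursive check. This is an illustrative Python sketch only, not the connector's actual code; the names `is_empty` and `non_empty_fields` are hypothetical, records are modeled as plain dicts and lists, and it assumes that an "empty object" as a first list element is judged by the same recursive rule as any other object.

```python
from typing import Any


def is_empty(value: Any) -> bool:
    """Return True if `value` is empty under the rules listed in this PR.

    Hypothetical helper: objects are modeled as dicts, lists as Python lists.
    """
    if value is None:
        return True
    if isinstance(value, list):
        if len(value) == 0:
            return True
        # Only the first element is inspected, as in the description.
        first = value[0]
        # Assumption: an "empty object" is a dict that is empty by this same rule.
        return first is None or (isinstance(first, dict) and is_empty(first))
    if isinstance(value, dict):
        # No fields, or every field is itself empty (all() of nothing is True).
        return all(is_empty(v) for v in value.values())
    return False


def non_empty_fields(record: dict) -> dict:
    """Keep only the fields that would contribute to the inferred schema."""
    return {name: v for name, v in record.items() if not is_empty(v)}
```

For example, `non_empty_fields({"id": 1, "tags": [], "meta": {"x": None}})` returns `{"id": 1}`, so only `id` would be considered when creating or evolving the table.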

When possible, using a message schema is strongly preferred over relying on schema inference.