snowplow-incubator / common-streams

Omit parquet field for a schema with no nested fields #74

Closed istreeter closed 4 months ago

istreeter commented 4 months ago

Snowplow users commonly use schemas with no nested fields. Previously, we created a string column for such a schema and loaded the literal string {} into it. There is no benefit to loading this redundant data.

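Omitting the column for these schemas also means we support schema evolution if the user later adds a nested field to the empty schema: the column can then be created with its proper nested type, rather than clashing with an existing string column.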

For empty schemas with additionalProperties: true, we retain the old behaviour of loading the original JSON as a string.
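
The rule described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the Scala code in common-streams: the function name `column_kind_for` and the labels `"struct"` and `"json_string"` are hypothetical. It also assumes that only an explicit `additionalProperties: true` triggers the old behaviour; how the real implementation treats a missing `additionalProperties` key, or one set to a sub-schema, is not stated in this issue.

```python
from typing import Any, Dict, Optional


def column_kind_for(schema: Dict[str, Any]) -> Optional[str]:
    """Decide what kind of column (if any) an object schema should get.

    Returns a hypothetical label, or None when the column is omitted.
    """
    properties = schema.get("properties") or {}
    if properties:
        # The schema has nested fields: load them as a structured column.
        return "struct"
    if schema.get("additionalProperties") is True:
        # Empty schema that explicitly allows arbitrary properties:
        # keep the old behaviour and load the original JSON as a string.
        return "json_string"
    # Empty schema with nothing to load: omit the column entirely.
    return None


# Hypothetical example schemas, for illustration only.
empty_schema = {"type": "object", "properties": {}, "additionalProperties": False}
open_schema = {"type": "object", "properties": {}, "additionalProperties": True}
evolved_schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}},
    "additionalProperties": False,
}

print(column_kind_for(empty_schema))
print(column_kind_for(open_schema))
print(column_kind_for(evolved_schema))
```

Under these assumptions, a schema that starts out empty has no column at all, so when a later version adds a nested field the column can be introduced for the first time with a structured type.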