snowplow / iglu-central

Contains all JSON Schemas, Avros and Thrifts for Iglu Central
http://iglucentral.com
Apache License 2.0
118 stars 114 forks source link

Release/r161 #1412

Closed matus-tomlein closed 1 month ago

matus-tomlein commented 1 month ago

This release contains the schema for the client_session entity version 1-0-3 that supersedes the 1-0-2 version. It adds the $supersededBy relation to the 1-0-2 version and the supersedes relation to the 1-0-3 version. The change in the 1-0-3 schema is that the previousSessionId property is no longer required (not a breaking change since it was nullable).

Changes

Needs extra attention as this is the first time that we use the supersedes relation in Iglu Central and the client_session schema is a popular one.

jbeemster commented 1 month ago

@matus-tomlein @istreeter silly question here but if we are going to need to patch 1-0-2 regardless with the $supersededBy relation is it not simpler to just remove the required condition on 1-0-2 as part of the patch and not introduce any superseded logic anywhere (nor 1-0-3).

Why do we prefer this approach to just patching 1-0-2?

matus-tomlein commented 1 month ago

@jbeemster Happy to go with just patching 1-0-2, I don't have a strong preference. Originally I thought we could just get by with the supersedes relation in the 1-0-3 schema, but it's true that since we are patching, we can just patch the 1-0-2.

istreeter commented 1 month ago

In terms of risk... both approaches are equally safe in theory with roughly equal amount of risk if we get something wrong. Or possibly slightly less risk to do Josh's latest idea of patch the schema.

On the other hand: Going forwards we want to take a position with customers/users where we strongly discourage (or even forbid) patching their schemas. So it's not great to have a rule for customers if we break the rule in Iglu Central.

On the 3rd hand: Consider BigQuery v1 users. As soon as you merge this PR, all existing BigQuery v1 loaders will suddenly start loading the context into a 1-0-3 column where previously it went to a 1-0-2 column. I know the snowplow dbt models handle that ok, but maybe not the customer's other warehouse queries.

That final argument makes me lean towards patching the schema instead of superseding it.

jbeemster commented 1 month ago

@istreeter @matus-tomlein assuming the patch is safe lets do that rather than make a much bigger amount of work out of all of this.

matus-tomlein commented 1 month ago

Sounds good! Have just updated the PR to only have the 1-0-2 patch.