Closed shanejearley closed 1 year ago
@hawyar you mentioned you had read some notes on Glue versioning. We can add a simple versioning strategy to our analytics deployment. Also, feel free to rename everything named ETL to analytics, which is more accurate.
Regarding schema versioning, first we want to create a schema registry, casimir_schema_regsitry, managed by Glue. Then we create three schemas, event, wallet, and staking_action, matching what we have in JSON Schema. When creating those schemas in Glue we also have to choose a compatibility mode, which dictates what happens when we add, delete, or update fields or their types. I have chosen "no compatibility" for now, which gives us flexibility, but please advise here. CDK would then pick up each schema from the registry to create or update the corresponding table.
The major version number of @casimir/data is now passed into the bucket and table names. Schema changes require a major version bump (otherwise the deploy will clash and fail). Maybe we can add an automated check for this beforehand; for now we can check in review. Note: I forgot to add the bucket name to the output bucket, but we may remove this from CDK anyway. Thank you @hawyar
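A rough sketch of the naming scheme and the automated check mentioned above, assuming the package version is a plain semantic version string. The helper names (majorVersion, versionedName, checkSchemaBump) and the example resource names are hypothetical, not taken from the repo. The separator is a parameter because S3 bucket names cannot contain underscores, while Glue table names conventionally use them.

```typescript
// Extract the major version from a semver string such as "2.1.0" or "v2.1.0".
function majorVersion(version: string): number {
    const match = /^v?(\d+)\.\d+\.\d+/.exec(version)
    if (!match) throw new Error(`Not a semantic version: ${version}`)
    return Number(match[1])
}

// Append the major version to a resource name, e.g. "events_v2".
function versionedName(base: string, version: string, separator = '_'): string {
    return `${base}${separator}v${majorVersion(version)}`
}

// Fail early when a schema change ships without a major version bump.
function checkSchemaBump(previousVersion: string, nextVersion: string, schemaChanged: boolean): void {
    if (schemaChanged && majorVersion(nextVersion) <= majorVersion(previousVersion)) {
        throw new Error(
            `Schema changed but major version did not increase (${previousVersion} -> ${nextVersion})`
        )
    }
}
```

For example, versionedName('casimir-event-bucket', '2.1.0', '-') gives casimir-event-bucket-v2, and checkSchemaBump('1.2.0', '1.3.0', true) throws, which is the case we currently catch in review.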
New schemas require new buckets, but tables should be updatable. This should be a simple fix to the table definition.