Snowflake-Labs / schemachange

A Database Change Management tool for Snowflake
Apache License 2.0
502 stars 225 forks source link

Add undo script functionality #158

Closed eaguilera23 closed 6 months ago

eaguilera23 commented 1 year ago

Taking inspiration from Flyway, this commits adds the undo subcommand to the application, with the --step <n> flag.

Undo scripts will have the U prefix, with the same name and version as their corresponding versioned script. It will attempt to undo amount of versioned scripts until it finishes, or no corresponding undo script is found.

Under the hood, the application is removing the V script from the change history table, so when the deploy command is ran, the versioned script can be applied again.

Algorithm:

This commit closes issue #19

eaguilera23 commented 1 year ago

Couldn't add you as a reviewer, so I'll tag you for visibility @sfc-gh-jhansen @MACKAT05

Let me know if you'd like me to change something. Looking forward on enhancing this useful project

eaguilera23 commented 1 year ago

Hi @sfc-gh-tmathew. Would it be possible for this PR to be reviewed?

MACKAT05 commented 1 year ago

@eaguilera23 Some thoughts.
not all operations are undoable.
For this methodology / functional pattern, an undo file is required for every V file created. philosophically the expectation would be that those would be prepared at the same time as the original V file... creating double work, and double testing since it would be poor practice not to test the undo script as well as the deploy scripts.
There is some agreement with my initial sentiments on SO: https://stackoverflow.com/questions/4959299/how-to-roll-back-migrations-using-flyway

Also It looks like flyaway has some special spice to allow it to interpret scripts and generate an undo script automatically. Currently schema change throws the whole rendered file at it and does not interpret scripts statement by statement. so little issues like a comment at the end of a file breaks deployment.

for this functionality there are a couple of alternative approaches. instead of a U script for v1.1.1 and v1.1.2 just create a v1.1.3 that does the functional equivalent and only when needed... the v1.1.1 and v1.1.2 files can be cloned and re versioned to v1.1.4 and 5 to go down the other branch in the repository so history is preserved. Or per the documentation of fly away maintain a clone of the repository at key points to rollback to as part of a robust recovery plan.

Is there something process wise in your experience that would not allow that alternative (and some what standard if not archaic feeling) pattern.

I had some architecture level thoughts about gathering some of the script ordering functionality into another class to make it exposable for other projects for the purpose of wrapping a UI or web interface around this for interactive deployment and debugging, along with a script to statements splitter and creating two simultaneous connections ( one for applying scripts and one for logging)

It would seem at the point we can interpret the scripts statement by statement we could do a much more robust version of this feature that would not create a burdensome patterns.

When I get some time this weekend I can take a closer look.

sfc-gh-tmathew commented 11 months ago

@eaguilera23 Catching up on open items and PRs. As @MACKAT05 pointed out, the Undo script will create more work than if we train users to follow the change forward approach.. aka alter from current state to next instead of trying to rollback. creating clones as milestones for rollback purposes in lower environments would be the recommended approach rather than support Undo scripts.

Thoughts?

podung commented 10 months ago

@sfc-gh-tmathew - I've been exploring some similar ideas. I think Snowflake as a tech stack presents a unique solution for this.

What I'd like my team to implement is this: If any migration fails, rollback the entire impact of that schemachange run to its state just prior to the schemachange process starting. As long as we know the start timestamp for the first schemachange script to be executed as a part of a run, it would be simple to use Snowflake's time travel to just reset the entire schema to that point in time.

I haven't explored the ideas fully, but here are some considerations:

  1. A new column on the change history table could be used to track a "start" timestamp (so the timestamp prior to it executing the script)
  2. If we implemented the first item, then my team would still need to try to figure out how which migration was the first migration for that particular schemachange invocation
  3. Alternatively, my team could orchestrate the "time start" logic via of CI/CD (track the time we invoked schemachange)
  4. Overall this general approach seems super solid for downtime deploys and deployments within pre-defined windows - however, it would not be a safe approach for a zero-downtime deploy scenario, as it would lose any data that was generated during the deploy window.

Regardless of how we determine the "rollback" timestamp - I think any orchestration of performing an actual rollback does not belong in the schemachange library, but should be handled by our CI/CD process or some tooling built to orchestrate it. But I could see an implementation guide for rollbacks being a useful community resource in a documentation section.

Would love any feedback on my thinking above.

sfc-gh-tmathew commented 10 months ago

@podung Changing the structure of the change history table will require careful consideration for those who have established pipelines to not break. Consider the default value that will be set before the column existed in the table. This will have to be part of a major release and carefully considered.

checkout the original discussion #19 where one of the comment talks about leveraging the root folder switch = /.rollback and have the rollback scripts in there and apply it as conditional step in the orchestration process. Still apply the change forward but effectively rollingback changes using timetravel in snowflake.

I am not in favor of removing the entry from the Change history table. We would want to capture the undo operation as much as the apply operation without loosing the activity on the database. By deleting the record from change history, we loose the history.

sfc-gh-tmathew commented 6 months ago

Thank you for your patience and contributions. We have decided not to pursue this functionality at this time.