Snowflake-Labs / schemachange

A Database Change Management tool for Snowflake
Apache License 2.0
481 stars 219 forks source link

Add Always Execute First Functionality #164

Open CaptCorpMURICA opened 1 year ago

CaptCorpMURICA commented 1 year ago

Description:

Add the functionality to execute specific scripts first on every execution of schemachange using the naming convention F__DESCRIPTION.sql.

Benefit:

For use in CI/CD process:

  1. Create clone of production environment to create the QA environment.
  2. Apply permissions required for the QA environment.
  3. Execute rest of schemachange workflow to apply deployment script to QA environment. If this fails, then the deployment would fail a production execution.

By tagging scripts to be executed first, schemachange can create the cloned environment and apply the appropriate permissions before it attempts to deploy any changes.

Best Case Enhancement:

The functionality would be improved if an optional condition could be applied so it is only executed against QA and not to PROD, since it would not be needed against that environment. For instance, if the naming convention was F__QA_DESCRIPTION.sql, then it would only run against the QA environment and F__STG_DESCRIPTION.sql would only run against the staging environment. This enhancement is not part of the MVP of this feature but would require schemachange to analyze the database to match a REGEX for which files to run or omit. It would not work unless specific naming conventions were utilized in Snowflake and is out of scope for this initial enhancement.

This could also be implemented with a command line argument like, -af, --always-first or in the YAML file of always-first: TRUE. This would enable the YAML for the Dev to QA pipeline to leverage these files but would not include it in the QA to Prod pipeline.

Currently, we use SnowSQL to execute stored procedures for cloning the environment and applying the grants. Once that completes, then the schemachange execution occurs. It would be ideal to merge this functionality with schemachange to keep the entire process contained into a single tool and reduce additional technical debt.

sfc-gh-tmathew commented 9 months ago

Hello @CaptCorpMURICA

Thank you for reaching out. Could you take a look at #110 and see if you can use the always first script using the jinja template and everything else after the first script in the included script. Does that accomplish the goal without adding additional switches to the command line?

Thoughts?

CaptCorpMURICA commented 9 months ago

Good afternoon @sfc-gh-tmathew,

Leveraging jinja templates would be a method that could be used; however, it requires the cloning script to be declared in every template. We are currently leveraging SnowSQL to call the cloning script at the start of the CI/CD process in the lower environments, and then we skip that stage for deployment to production. The current solution works, but it is less efficient because it requires using SnowSQL instead of keeping everything within schemachange. The jinja template method would be less efficient than the current method due to the failure of the "Don't Repeat Yourself" methodology. This added functionality to schemachange would allow the developers to enable these scripts for lower environment pipelines, but not production. It keeps within the imperative-style approach while also conforming to database development standards of minimizing code repetition to reduce the potential for error. If the options are between SnowSQL or the jinja templates, we would continue to utilize SnowSQL due to the efficient execution methodology.

Sincerely, Kevin Zufelt

sfc-gh-tmathew commented 8 months ago

Thank you for the explanation. We will review this functionality and get back to you as mentioned in the PR #173.

sfc-gh-tmathew commented 7 months ago

@CaptCorpMURICA

As we discussed internally, how would we handle order of multiple First scripts?

Rather than add a new naming convention, Would you consider the switch to point a folder location instead of file type? In a cicd pipeline, this switch can point to different folders, dev, qa, stg etc. and solve your environment requirement as well.

say the switch --always-first points to a folder that contains script or scripts to be executed first.

thoughts?

CaptCorpMURICA commented 7 months ago

@sfc-gh-tmathew, that would be a viable solution, but it shouldn't be necessary. For our environment, we have our migration scripts in a separate folder solely for organizational reasons, but schemachange looks for F__*.sql in all folders. Just like repeatable scripts, always first scripts are executed in alphabetical order. Having --always-first point to a specific folder is a good solution though since you can have a folder for QA migrations and a separate one for STG or PRD migrations depending on the needs of the business.

sfc-gh-tmathew commented 7 months ago

Thank you for you consideration @CaptCorpMURICA. Would you be able to modify your contribution to have --always-first switch to be a folder instead of a file type ?