Add support for merging a registry with a modified plan

acrobat commented 10 years ago

I have a problem with sqitch plans (and databases) between different versions. We currently have a 2.0 branch and a master branch (which represents a future 2.1 version) and there will be changes on both branches. Changes on the 2.0 branch will be merge into master (bugfix changes) and master will have feature changes, but when merging branches it isn't possible to deploy changes with sqitch. This is an example case:

2.0 branch is created, this is the state:

2.0 branch	master branch
change A	change A
change B	change B
change C	change C

Bugfixes are created on 2.0 and new features developed on master. A database per version is build from scratch

2.0 branch	master branch
change A	change A
change B	change B
change C	change C
bugfix A	Feature A

We merge 2.0 in master (to transfer bugfixes to master). This is the result:

2.0 branch	master branch
change A	change A
change B	change B
change C	change C
bugfix A	Feature A
	bugfix A

But here is where the problem occurs, the merge like this is correct for the master database. Bugfix A must be include after feature A, because if bugfix A is before feature A sqitch deploy will exit with an error that it can not find bugfix A in the plan file (quite strange error, but it's an wrong error message i think)

So this will block us from upgrading databases from version 2.0 up to 2.1

How can we solve this problem or can we find a solution in sqitch so that the order of the plan file doesn't matter but it just follows the requirements. I hope my explanation is a bit clear otherwise i will be more than happy to explain it a bit more!

Thanks in advance

theory commented 10 years ago

Nice use of tables!

FWIW, though, the mail list is probably a better place for a question like this.

Anyway, I think the thing to do is to rebase master on top of the 2.0 branch, like so:

sqitch checkout master
sqitch revert --to change_c
git rebase v2.0
sqitch deploy

This ensures that the state of the database is as as the master branch would understand it. Then you revert the changes added to master since v2.0 branched. Then rebase on v2.0. That will revert the Feature A commits, then apply commits from the time v2.0 was branched to v2.0's HEAD, meaning bug fix A. Then it reapplies commits created since v2.0 was branched, meaning it adds Feature A back in. At that point, you should be able to deploy as usual.

HTH!

theory commented 10 years ago

Another approach that will do the same thing, but doesn't require you to remember the name of the last change common to both v2.0 and master:

sqitch checkout v2.0
git checkout master
git rebase v2.0
sqitch deploy

acrobat commented 10 years ago

Thanks for the tip about the mailinglist! Yes i was aware of the rebase command and that's an approach which does work in a development environment, but how to solve this with production databases? A client application has started with version 2.0 and now wants to upgrade to 2.1, how would you solve that?

theory commented 10 years ago

I would follow the process outlined above. At that point, the master branch is ready for a 2.1 release. Create a new 2.1 branch and do the release from there:

git checkout -b v2.1
sqitch tag v2.1 -n 'Tag v2.1.'
sqitch bundle --dest-dir ~/widgets-v2.1.0

Now master is the path to v2.2. If you make a change to v2.0, you'll need to rebase v2.1 from v2.0, then rebase master from v2.1.

acrobat commented 10 years ago

But how would this work for production databases? Because a rebase could potentially mean data loss for that db.

The main problem, i think, is that by merging bugfix sqitch scripts from 2.0 into master (2.1) the sqitch.plan file get's out of order compared to databases which were build from 2.0 and now need 2.1. If sqitch could just follow the requirements it would not be a problem how the lines were in the plan file, the user just need to make sure that their requirements are correct

theory commented 10 years ago

This pattern assures that changes are always in order in the master branch. I am assuming that you don't deploy to production every time you change master. What you do is deploy only when you do a release.

A more useful pattern might be to keep a develop branch for forward-looking work, and keep master in final release state. In this case, if master was currently the v2.0 release, and you made changes to the develop branch towards v3.0, and to the v2 branch for maintenance, when you were ready to release v2.1, you would merge it into master (because it should only have changes made since master was last released), then rebase develop from master. Continue to do maint work in the v2 branch and new work toward v3.0 in develop.

Later, when you're ready for v3.0, create a v3 branch from develop, then merge it into master and release. Rebase develop from master again. Now continue doing maint work in the v3 branch and forward-looking v4.0 work in develop.

Make sense?

acrobat commented 10 years ago

Yes this makes sense except that we already use such gitflow only that we have a master (next minor version) and version branch which are always stable and ready for release.

So the git part is clear to me, but the problem is more with the sqitch plan between version, i don't exactly get how manage that. Maybe we are doing something wrong. But merging database bugfixes from 2.0 to master will append those changes (that way test builds for our master branch can be deployed) but when eventually master is released and users from the 2.0 branch will upgrade to 2.1, the database deploy will fail with this kind of error

$ sqitch deploy
Cannot find 46446ea738171102cd9284d471a7607ee6e5bf1a in the plan

//In the example above the hash would be representing change "Feature A"

Because changes from master will be first in the master sqitch plan and bugfixes from 2.0 on the end of the file

I hope you understand the problem a bit better know, i guess it's not related to the git flow but more how the plan file is build. It's quite possible we are using sqitch in a wrong way, but i can't find a "right" way, so it seems :(

tprocter commented 10 years ago

I'm interested in suggestions for this as well. We have this problem fairly often and it usually requires manual correction by editing the plan file (swapping changes around) followed by babysitting the actual deployment, taking changes one at a time and correcting the plan and/or the sqitch database as we go.

I'd like to see a more stable way to manage the plan when more than one version of the application needs to be actively maintained. In this scenario, we can't assume that changes are always being appended to the end of the plan. They might need to be inserted in a previous tag, for example adding a hotfix tag between releases. Again - easy to do on the release branch, but tricky on a master or develpment branch where future changes already exist.

theory commented 10 years ago

I see, the problem is that you can have two versions in production at once. So if you add a chance to your v2 branch and release, and then add the same change to the master branch and release, the change will be in different places, which screws with people when they later want to upgrade from a master release to a v2 release.

I'll have to think about this. The current implementation assumes that changes will always be sequential, but maybe we'll have to add some kind of support for multiple parents, like Git commits support. I'm not at all sure how that should work, though. Someone smarter than me might have to give it some think.

acrobat commented 10 years ago

yes you are right @theory that's the case what is was trying to explain. I will also try to find some more info on how git commits work and how this can be supported in sqitch. That way we hopefully find a decent solution for this problem!

theory commented 10 years ago

It would most likely end up requiring a new plan format to support multiple parents.

acrobat commented 10 years ago

I found this stackoverflow link on the sqitch usergroup, http://stackoverflow.com/questions/13369187/can-order-be-better-preserved-when-using-the-tsort-algorithm

Maybe it's a solution to not have a sequential order in the plan file, but just follow the requirements as they are defined by the user?

theory commented 10 years ago

The reason it uses a git history/BitCoin blockchain-like approach is to guarantee that things are always deployed in the correct order. This has saved us from fucking up production deployments a few times. That is something I don't want to give up.

But if there were a few branches in the chain, that would be okay. I suspect what we'd have to do is add something to the plan to list parents for each change, then update things to look for multiple parents. I don't have a lot of mental bandwidth to give this at the moment, but I suspect it will require quite a bit of re-organization of the plan parser and in-memory representation. It's do-able, though, I have no doubt.

acrobat commented 10 years ago

Yes i see, that way it's indeed not a good idea to drop the "sequential" plan type. Would be nice and hopefully we are able to integrate this change in to the 1.0.0 release!

theory commented 9 years ago

I've been thinking about this off and on for the last few weeks. I think that, if we want to get something done in time for v1.0.0. it wouldn't be a plan format with multiple children from a single change. I have no frigging idea how to go about that at this point.

So I think an interim solution would be to add some kind of --force option to deploy and related commands. It could work in one of a couple of ways:

Just use the change name for the last change to decide where we are in the plan. Do the deploy from there, and if any changes have the same names as those already deployed, skip them. You would then end up with the last change having the proper ID, so subsequent deploys would work from there (assuming the same sort of divergence doesn't happen again). Downsides: The SHA1 chain will be broken in the database, though currently nothing validates that. More importantly, though, how would it tell the difference between something having been deployed already and a new version of it? That is, if you deployed change "foo" to v2 and also to v3, is it the same "foo" or might it be different? This could be worked around by storing a SHA-1 of the deploy script itself, which is something I've thought for a while we ought to do anyway.
The second option could probably be built on the first: In addition to all that, go back and find the last SHA-1 that is is common, and then rewrite the SHA-1s in the database to match the new plan. Again, this would work especially well if we were to store the SHA-1 of each deploy script. Afterwards, the plan could be traced through the IDs in the database again.

So, here's what I propose to do:

Add the deploy script hash to the database.
Add the an upgrade command so that existing databases can be upgraded with the new column (#87).
Add the --force option (or --merge perhaps?) to do do the log rewriting described above, relying on deploy script hashes to trace things properly. This would also require updating tags (when you upgrade from v2 to v3, the "v2" tag on "foo" would need to be replaced with "v3").

Thoughts?

theory commented 9 years ago

I've now implemented the upgrade command and closed #87.

tprocter commented 9 years ago

So here are my thoughts.

Let's take the time to do this one right. Don't rush it for the 1.0 release. In my view, the issues outlined here are the difference between a convenience tool for basement projects and a viable option for production system deployments. In production, when dollars and business continuity are on the line, it's not enough to encourage best-practice workflows. With large-scale projects and multiple developers working on multiple versions of the same code simultaneously. It's inevitable that there will be exceptions to the expected behaviour. We always need a fallback plan for when the assertions fail.
Due to the complexity of the teams and projects that I work with personally, I see sqitch integrity errors break deployments on a daily basis (most of these not in production thankfully).
- To survive this, we've developed a helper tool for sqitch that identifies discrepancies between a sqitch filesystem and a target database. It recognizes changes that have been inserted, deleted or modified on the filesystem with respect to what the database expects, and offers steps to manually correct this (mostly in the form of SQL).
- Sqitch could be doing this itself, similar to the 'force' idea but with a bit more built-in intelligence behind it.
- You never want a repair to be done by default since even with the best of algorithms, the repair could make things worse.
- Don't break support for the verify command. This is a critical piece of improving confidence in the repair, especially when reworks are involved.
- Two commands that would make Sqitch immensely easier to use would be:
- insert - add a new or reworked change between existing tags. I realize some teams may not want to use this in their respective workflows, so maybe have it disabled by default, but for the rest of us it's critical. The only alternative is to edit the sqitch plan by hand and I'll leave it to your imagination what kind of problems that can cause.
- rename - in order to keep the filesystem tidy, and to deal with cases when objects in the database get renamed, there should be a way to do this. Sqitch should be able to track an old and a new name for a given change.
I don't think we need a multi-parent system. Most of the use cases for this could be solved with a solid implementation of the repair command (or we could call it reconcile/resolve/converge/cleanup/...). Much of the errors could also be avoided if we continue with the git metaphor and add "staging" support.
- Changes could be collected in a staging area (like a mini-plan), but only moved into the plan at a chosen time.
- The final ordering of changes isn't determined until they are moved into the plan (users can select which changes and when to move them from staging to final). This duty might be given to someone with a release manager role.
- This would allow developers to work on feature branches that might be worked on across sqitch tags, which is currently very painful and error-prone to merge. It would also reduce the frequency that we get a git conflict on the plan file itself which currently happens on almost every merge.
- We might also consider multiple named staging areas if users want to work in fully isolated spaces, but that might be more trouble than it's worth.

Ovid commented 9 years ago

In production, when dollars and business continuity are on the line, it's not enough to encourage best-practice workflows. With large-scale projects and multiple developers working on multiple versions of the same code simultaneously. It's inevitable that there will be exceptions to the expected behaviour.

Completely agreed. My current client is hiring plenty of devs and not all of them take the time to read the documentation or, if they do, they don't always understand it. While it's good to have well-designed software that offers and affordance to do the right thing, in reality, people will always find ways of doing unexpected things.

theory commented 9 years ago

To survive this, we've developed a helper tool for sqitch that identifies discrepancies between a sqitch filesystem and a target database. It recognizes changes that have been inserted, deleted or modified on the filesystem with respect to what the database expects, and offers steps to manually correct this (mostly in the form of SQL).

I would be very curious to see what this code looks like. This is the purpose behind my plan to add deploy script hashes to the database, essentially as a second identifier that derives from the script itself, rather than the plan.

theory commented 9 years ago

insert - add a new or reworked change between existing tags. I realize some teams may not want to use this in their respective workflows, so maybe have it disabled by default, but for the rest of us it's critical. The only alternative is to edit the sqitch plan by hand and I'll leave it to your imagination what kind of problems that can cause.

Maybe add options to the add command, like --before foo, or even --before foo^ or --after @bar. I'm not sure how this would affect deployments, though. If you have already deployed foo, then add a change before it, how should the deployment work?

rename - in order to keep the filesystem tidy, and to deal with cases when objects in the database get renamed, there should be a way to do this. Sqitch should be able to track an old and a new name for a given change.

Hrm. That's quite a lot of information for Sqitch to have to track, and the plan format is kept deliberately simple. Right now Sqitch does't know any names at all, it's all in the plan.

What about something like sqitch rebase --force --log-only? That would update the registry of a deployed database to reflect exactly whats in the plan, without running any of the deployment scripts.

theory commented 9 years ago

Much of the errors could also be avoided if we continue with the git metaphor and add "staging" support.

Changes could be collected in a staging area (like a mini-plan), but only moved into the plan at a chosen time.

The final ordering of changes isn't determined until they are moved into the plan (users can select which changes and when to move them from staging to final). This duty might be given to someone with a release manager role.

This is already something that can very much be done with SCM branches, and a single party responsible for merging.

This would allow developers to work on feature branches that might be worked on across sqitch tags, which is currently very painful and error-prone to merge. It would also reduce the frequency that we get a git conflict on the plan file itself which currently happens on almost every merge.

Have you tried using union merges? I discussed some patterns with them to simplify merges in the tutorial.

We might also consider multiple named staging areas if users want to work in fully isolated spaces, but that might be more trouble than it's worth.

Well, it might be interesting to allow a single project to have multiple plan files…

theory commented 9 years ago

Completely agreed. My current client is hiring plenty of devs and not all of them take the time to read the documentation or, if they do, they don't always understand it. While it's good to have well-designed software that offers and affordance to do the right thing, in reality, people will always find ways of doing unexpected things.

Yeah, the point of this issue has become, I think, "How do we add tools to correct for unexpected conflicts?" Examples include inserting change before tags, changing the plan order, and having multiple releases (and therefore tags) with the same changes. I think we can draw some inspiration from git push --force, git filter-branch, and the like.

theory commented 9 years ago

Okay, with these commits, I've added support for logging the deploy script hash along with each change. With that in place, I propose a first pass at reconciling differences between a target registry and a plan as follows:

Add a new option to deploy: --merge. Without this flag, deploys happen just as they do now. But with it, we try to merge a plan into the registry by taking the following steps:

Find the latest change ID in common between the registry and the plan. If there are none, we will try to merge them all.
For all changes deployed after that change, try to find the corresponding change in play by comparing script hashes. This will only work if there have been no changes to the deploy script for a given change.
If a deployed change cannot be found in the plan by script hash, try looking by name. This will only work for non-reworked changes, because reworked changes have multiple instances with the same name.
If a change cannot be found by script hash or name in the plan, abort the deploy. We cannot do a merge this way. (Later on, perhaps we'll add some sort of --force option to take more extreme [and potentially destructive] steps).
If all such changes can be found in the plan, put together an in-memory "merge plan" by examining all changes in the plan since the common change ID. Evaluate them as follows:
- If the change is not already deployed, plan to deploy it as usual.
- If a change is already deployed but with a different ID, plan to simply update all attributes other than the commit information (we want to preserve when it was deployed and by whom, even if we change its name and ID). Tags should also be updated to be only those in the plan.
- Perform the usual pre-deploy check for conflicts or other issues.
If it all checks out, then deploy the merge plan. If a change is merged, rather than deploy, log it in the events table.

This approach should solve the original challenge, where multiple releases of the plan can have different tags and use the same changes. Assuming those changes have identical deploy files or have the same name and are un-reworked, we should be able to merge things such that everything is in the proper order.

The one catch I can think of is if the last change in the plan is not the last change in the registry (ordered by commit date). In that case, we would need to update the commit date for that one change.

Thoughts?

acrobat commented 9 years ago

By the looks of how the merge option works it should fix this issue or atleast in the case where we are at. The plan file is just a bit out of order but the content of each file referenced in the plan file is exact over all branches!

So I think we are good to go with this fix!

theory commented 9 years ago

Excellent, thanks for the feedback, @acrobat!

theory commented 9 years ago

In expectation of the proposed work here, In da8f24e5 I went ahead and added support for "merge" events to be logged. None are logged yet, but I wanted to get all the schema changes done before a release, which I expect to do soon.

acrobat commented 9 years ago

Looks good @theory!

decibel commented 8 years ago

I'm not sure if this is still an open item, but here's a few thoughts about it:

I think it would be useful to have a way to scream loudly about any changes that happen in a dev environment that might not be able to deploy in prod. In order to do that, you'd need some way to publish the current state of multiple production environments, and have sqitch be able to grab that info. I'm not sure git would be the right tool for that, though there is an interesting possibility of specifying a number of git branches/tags as being "production", and looking at the sqitch.plan in each of those. But that still depends on each dev keeping git in sync, so probably better to also support pulling the info from some other source.
Right now there's still too much need to hand-edit the plan when you screw something up. #290 and #295 start to address that, but at a minimum you'd still need to be able to change dependencies of something you're working on, and possibly re-ordering things (though, ideally the order would be determined only by dependencies...)
Once there's no more reason to hand-edit the plan the door opens for having a plan that's much more like a log of actions instead of a static snapshot. That allows for the concept of "hey, just replay these actions". I suspect that's what's needed to truly support things like renames that will also work with deployed changes.
When it comes to renaming objects, we could have specific rename templates. For that matter, maybe it's time to have object-specific templates (ie: separate templates for tables, triggers, views, etc). It seems that general best practice is to give every object it's own file anyway (so that you can use sqitch rework), so this would make that easier.

All of those sound pretty non-trivial. Maybe they're worth doing; maybe not... but if they're not done then it would be good to have specific documentation on how to handle some of these scenarios. (Maybe some of that already exists; I'm not deep enough into the Koolaid yet to have encountered most of these problems...)

theory commented 8 years ago

FWIW, you can create object-specific templates yourself, as detailed in this blog post.

sqitchers / sqitch

Add support for merging a registry with a modified plan #200