Open jayvdb opened 1 year ago
You make good points. An `unmake` command in the CLI would be low-hanging fruit to improve the situation.
Butane's migration system is loosely inspired by Django's, although the latter's is obviously far more mature.
The "unmake" approach is probably short-sighted. In production scenarios, the DDL/DML for a migration often needs to be hand-coded. The current butane system allows for this, as it has up/down.sql which can be edited.
IMO, we don't want to be deleting hand-coded migrations, as support for them is a feature butane needs in order to be production-ready. `reshape` (https://github.com/Electron100/butane/issues/67) provides a way to avoid hand-coding DML, but that would be an optional butane feature, and I am not 100% sure `reshape` can entirely avoid the need to add custom DDL/DML to migrations. The Django migration system allows custom Python code to be used for each migration, and I have needed to use that so frequently that I doubt a purely declarative migration system is sufficient.
Another approach might be to give the user the ability to perform the following sequence:
- … `.butane/migrations/state.json` to the prior state.
- … `embed`
- … `git rebase`.
- "rejoin" a specified migration after they have rebased their branch. This would:
  - … `.table` files in the migration, and report any which were modified. This helps the developer see what has changed since the migration was written, which gives them a huge clue bat as to whether the migration being rebased still makes sense.
  - … `--force` was given, rename the specified migration to use the current timestamp, write the new files, etc.
When there is an existing detached migration, all commands would be disabled except for the "rejoin", with an error message informing the user how to delete the detached migration if they want to abandon it.
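To make the "rejoin" check a bit more concrete, here is a minimal Python sketch. It is only an illustration, not how butane works today. It assumes the detached migration's `.table` files are compared against the ones in a "current" directory, and that the rename only goes ahead when nothing changed or a force flag is set. The function names and the `new_name` parameter are hypothetical, and writing the new files is left out.

```python
import filecmp
from pathlib import Path


def modified_tables(migration_dir: Path, current_dir: Path) -> list[str]:
    """Return names of .table files that differ from, or are missing in, current_dir."""
    changed = []
    for table in sorted(migration_dir.glob("*.table")):
        other = current_dir / table.name
        # shallow=False forces a byte-by-byte comparison of the file contents.
        if not other.is_file() or not filecmp.cmp(table, other, shallow=False):
            changed.append(table.name)
    return changed


def rejoin(migration_dir: Path, current_dir: Path, new_name: str, force: bool = False) -> Path:
    """Rename the detached migration to new_name (hypothetical behaviour).

    Refuses when any table changed since the migration was written,
    unless force is set (the assumed meaning of --force).
    """
    changed = modified_tables(migration_dir, current_dir)
    if changed and not force:
        raise RuntimeError(f"tables modified since the migration was written: {changed}")
    return migration_dir.rename(migration_dir.with_name(new_name))
```

Reporting the modified tables before refusing is the useful part: it tells the developer which models moved underneath the migration while it was detached.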
One of the items in https://github.com/Electron100/butane/blob/877b65f/notes.org is "Remove timestamp from migration names". This refers to the directory names of the migrations, which appear in `.butane/migrations/` and are named like `YYMMDD_<seconds>_<migration_name>`.
I don't know all of what might be intended by that org entry, but one reason for removing the timestamp from the migration name is that it is problematic when it comes to changing the order of pending migrations.
For example, two PRs are submitted that both include a migration. The date is in the migration names. To simplify the explanation, assume PR "A" is created on Monday, PR "B" is created and merged to `main` on Tuesday without modification, and PR "A" is updated and merged to `main` on Wednesday.
The first obvious problem is that the date of the PR "A" migration will be before that of PR "B", despite the real order being the other way around. The date a migration was created is not helpful from the perspective of a migration. The order that matters is the order in which the migrations need to be applied.
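A tiny Python illustration of the ordering problem. The directory names are made up, following the `YYMMDD_<seconds>_<migration_name>` pattern.

```python
# Hypothetical migration directory names; timestamps and names are invented.
pr_a = "230605_36000_add_author"  # created Monday, merged Wednesday
pr_b = "230606_36000_add_tags"    # created and merged Tuesday

# Order in which the migrations actually reached main (and must be applied).
merge_order = [pr_b, pr_a]

# Order implied by sorting on the date prefix.
name_order = sorted([pr_a, pr_b])

print(name_order)                 # PR "A" sorts first because it was created first
print(name_order == merge_order)  # False: the prefix disagrees with the real order
```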
The more difficult problem is that updating PR "A" is hard. When PR "B" is merged, it will cause conflicts in PR "A". This is good at present, because it prevents accidentally merging PR "A" without first rebasing it on top of PR "B".
And all of this is a very basic situation. It gets much harder if there are multiple branches, such as when supporting multiple versions, with various patches being back-ported to older versions in an unpredictable order.
The approach I am using is to `git checkout main .butane/migrations`, then `butane makemigration <name>` to recreate the migration.
This process means the date of the PR "A" migration will be after the date of PR "B", which makes the date prefix sensible again.
Can we improve this? At the very least we can create a `butane_cli` command that "unmakes" the last migration (deletes it and rolls back `.butane/migrations/state.json` to the prior migration name), which is basically step (1). The `clean` command deletes all of `.butane/migrations/current/`, but not the existing migrations, which is step (2).
The prior migration name is available in the `from_name` in the `info.json` in each migration.
I've spent a lot of time with Django, and it is seen as quite good. We should seriously consider their migration approach unless it can be improved upon. https://docs.djangoproject.com/en/4.2/topics/migrations/
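A rough Python sketch of what the "unmake" step described above could do, using `from_name` to find the prior migration. It is an illustration under assumptions, not butane's implementation. It assumes `state.json` stores the latest migration name under a `"latest"` key and that `info.json` stores `from_name` as a top-level key. The function name is hypothetical.

```python
import json
import shutil
from pathlib import Path


def unmake_last_migration(migrations_dir: Path) -> str:
    """Delete the latest migration and point state.json at its predecessor.

    Assumed file layouts (not confirmed here):
      state.json             -> {"latest": "<migration directory name>"}
      <migration>/info.json  -> {"from_name": "<prior name or null>", ...}
    """
    state_path = migrations_dir / "state.json"
    state = json.loads(state_path.read_text())
    latest = state.get("latest")
    if latest is None:
        raise ValueError("no migration to unmake")

    info = json.loads((migrations_dir / latest / "info.json").read_text())
    state["latest"] = info.get("from_name")

    # Update the state first, so a failed delete leaves a stray directory
    # rather than state.json pointing at a migration that no longer exists.
    state_path.write_text(json.dumps(state, indent=2))
    shutil.rmtree(migrations_dir / latest)
    return latest
```

Note that this deletes the migration directory outright, including any hand-edited up/down SQL in it.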
Django defaults to a serial number as the prefix for migration names, but does, by default, include the date in the middle of the migration name. This reduces the importance of the date. The serial number is just as bad as a date, as it is common to have migrations merged out of order.
There are Django apps which help with rebasing, including renumbering migrations. IMO we can learn from those and we should try to get that type of functionality built in.
There are other migration-specific tools that do better than typical ORMs, so we can learn a lot from them. cf. https://github.com/Electron100/butane/issues/67