Closed loilo closed 4 years ago
I like the idea in general, binding it to the Manager plugin. Three thoughts:
The terms "source version" and "target version" used below refer to "the package in the version that is currently installed" vs "the package in the version that we want to migrate to".
Sure. To be honest, I'm not proficient enough in the ecosystem to be aware of existing migration solutions.
(Immutably) versioning migrations would allow us — with knowledge of source and target version numbers — to only execute the migrations necessary to get from one version to the other.
Without that mapping, I'm not sure how we'd detect which migrations to execute:
Roll back all migrations of the source version and then execute all migrations of the target version.
I don't think this would be a good approach since (depending on the kind of migrations performed) it can create a loss of data.
Diff the migration lists of source and target versions and execute what's needed.
I would be all-in for that approach if it's technically viable. I haven't considered it so far because It would require access to the target version's migrations list before the source version is removed.
Or did I miss another approach to achieve this?
I'm not sure I'm understanding your question correctly. Assuming it's about "how to roll back migrations if the list of migrations has just been erased from the harddisk": We should be able to execute lifecycle hooks just before deletion via Composer's pre-package-uninstall
hook, shouldn't we?
(Immutably) versioning migrations would allow us — with knowledge of source and target version numbers — to only execute the migrations necessary to get from one version to the other.
Doctrine Migrations takes care of that by storing the already executed migrations in the database. So the database knows in which state it is. Doctrine Migrations does not care about version numbers or anything similar, it just stores the class name (sort of). By default it generates a class name with the current timestamp, anything as long as it is unique (which is more of a problem when using bundles).
We should be able to execute lifecycle hooks just before deletion via Composer's pre-package-uninstall hook,
Good point, maybe we can do that.
Okay, if we can get Doctrine Migrations to handle rollbacks/updates on a per-bundle basis, that sounds very reasonable.
The only problem I currently see is that Doctrine Migrations are run inside the Contao application. And composer events (package install/update/remove) are not. But I guess we'll find a solution for that.
BTW, it was not integrated in Doctrine for many reasons. Try searching for "bundle doctrine migrations" :-) But I guess most of the downsides do not apply to us. For regular doctrine migrations, you have to execute doctrine:migrations:migrate
so I think it would be a good idea to just use the same process. Like contao:migrate
that internally asks all the bundles in our order and then at the end the application level ones. I don't think it's a good idea to try to hook into the doctrine process. I think it's better to provide a standalone alternative that uses doctrine migrations under the hood.
i dont think doctrine migrations is suited as a runonce replacement.
runonce
files that essentially are just migration files from one version to another. They include a check for the current version or state of the module and a resulting action (or no action). In that regard, idempotency of runonce
files is only as good as their implementation.
That's why I don't think there's a significant difference in robustness in the way module authors implement migrations. But we may have more robustness by completely discouraging developers to create any such irreversable changes by just not providing a way for reversion — that's a very valid point.However, both of these points are why I explicitely asked for feedback on the current usage of developers' runonce.php
files. It will be important to know the use cases before making a move into any direction.
Idempotency is a debateable point. Browsing GitHub, I've seen a non-zero amount of runonce files that essentially are just migration files from one version to another. They include a check for the current version or state of the module and a resulting action (or no action). In that regard, idempotency of runonce files is only as good as their implementation. That's why I don't think there's a significant difference in robustness in the way module authors implement migrations. But we may have more robustness by completely discouraging developers to create any such irreversable changes by just not providing a way for reversion — that's a very valid point.
idempotency has nothing to do with reversibility. it just means that i can apply the same runonce over and over again, without changing the (initial result). when we use a little bit of math to describe this, where f is our runonce and X is the initial state and Y the final state: Y = f(X) = f(f(X)) = f(f(f(X))) = f^n(X); for all n of N
That is an implicit requirement of the current runonce implementation and it eases runonce handling, because you can just drop in the runonce again, if the previous process ran into some race conditions or did abort, or if you reinstall the extension. also there is no need to track which runonces are already executed, since you can just re-execute all runonce in the correct order and get the same final state as when you would just execute the "new" ones.
I know what idempotency is, I just communicated my point poorly – sorry for that.
What I tried to say is: idempotency is given for runonce
scripts because of the way they are implemented.
A theoretical example: a runonce
script should add an item to a list.
The algorithm will be like:
The script is idempotent, but it only is because we do a check in step 1. In other words, idempotency depends on the quality of the implementation.
What I wanted to point out is that, vice versa, if a module's migration script in a properly designed migration system is not idempotent, it's because of poor implementation on the module author's side — a factor that should apply to runonce
files and migration scripts in the same way.
Or am I completely missing something important?
i agree that it is an error to not fulfill the (implicitly) required idempotency, when writing runonce scripts.
also if an extension has 2 runonces f
(first) and g
(second) with X
being the initial state and Y
the final state, then to fulfill idempotency the following MAY NOT hold: Y = g(f(X)) <-/-> Y = g(X)
. that means you still need to apply f
at least once before applying g
. this is ensured by all composer driven runonce implementation ("ER driven" runonce implementation dont support multiple runonce), because they use the definition order of runonce files (in the composer.json) and execute them in order. with other words: all runonce files are treated as one runonce that is the result of concatenating the single runonce files in definition order: a = f ◦ g
but i still dont get your point? bad code is bad, like said, but that doesnt mean we shouldnt require properties that improve UX (UX != DX).
btw. the point of using database migrations utilities like doctrine migrations (there are better migration tools) is that you dont need idempotency, because you version your db schema and persist the schema version inside the db itself, so the "should i run this part of the migration or not?" is answered by the stored version: a "single" number (actually a set of numbers) that reflects the whole db state in a sense.
btw. the point of using database migrations utilities like doctrine migrations (there are better migration tools) is that you dont need idempotency, because you version your db schema and persist the schema version inside the db itself
Right, but the resulting execution run by the migration framework still is idempotent, that's the framework's job.
That's basically my question boiled donw: How is it qualitatively different whether idempotency is enforced through a runonce
script itself or through the underlying migration system?
probably no difference, but this is just a naive guess and i wouldnt count on that. but why does it matter? its 1 point on my list, that, when i did put it on, wasnt sure (and still arent), if it is a property fulfilled by doctrine migrations.
I agree that Doctrine Migrations has it's flaws and was not made for this. However, DM is idempotent in the sense that it tracks what has already run, which eliminates the issue of a developer correctly tracking what has already been done.
In regards to the database field updates, I never had any issues with that by just using preUp
, up
and postUp
. Running out of memory is not really a concern for me, because that will happen in whatever implementation one would use.
@backbone87 It matters because to discuss the point I needed to understand it first, that's why I wanted to make sure I'm getting your argument right.
I haven't commented on your other points because I don't disagree or need clarification on them. Using Doctrine Migrations has neither been part of my proposal nor do I have the experience to defend or disqualify it as a tool.
As discussed in Mumble on September 27th, we want to add a migration interface and a command to run the migrations. This is not to be mistaken with Doctrine migrations.
interface ContaoMigration
{
public function shouldBeRun(): bool {}
public function getDescription(): string {}
public function run(): void {}
public function getRunAfter(): array {}
}
Sounds nice. Some questions though:
getRunAfter()
should return/when it would be executed?Just a few more questions, for specification:
shouldBeRun
? Like Contao tracking already executed migrations?And some more:
shouldBeRun
true? Does the user have restart/reconfirm the migration process?Looks like you guys have thought about this topic already, why don‘t you work on something? That would be great!
According to our discussion on Mumble (no definitive answers):
I'd be curious what getRunAfter() should return/when it would be executed?
The classes this migration should run after.
So if I'm getting this right, no mechanism for downgrading / removal is planned?
Yes. Downgrade is hard to impossible to achieve if packages get removed the way they are removed now (composer). Earlier versions would have to know how to deal with future changes.
Are Migrations Services? (aka do i can inject services into them) Can there be multiple Migrations per Module/Bundle?
Yes. Just use multiple classes with that interface.
Do Migrations run before or after the DCA-driven DB-schema update? Maybe an option to run before and after?
The contao install tool updates would use the same mechanism. So you could e.g. specify if yours should run after them.
Are there other preconditions except shouldBeRun? Like Contao tracking already executed migrations?
No.
Is the description displayed to the user before the migration is executed?
This would allow a dry-run option, where shouldBeRun is executed and a list of those returning true + their descriptions is shown.
Is the user able to skip specific or all migrations? (My opinion: he should NOT be able to)
Yeah, I don't think so.
What if running migration A makes migration B's shouldBeRun true? Does the user have restart/reconfirm the migration process?
Yes. Alternatively we could execute the whole thing until all return false ;-).
What if this results in loops?
Probably we shouldn't do what I just said then. :-) This won't happen that often, I guess or do you have an example?
Yes. Downgrade is hard to impossible to achieve if packages get removed the way they are removed now (composer). Earlier versions would have to know how to deal with future changes.
Aw, snap. Well, I had thought out a way for that in the original proposal (opening post of this issue), I have to admit though that it's probably uncannily fragile.
I'd be curious what getRunAfter() should return/when it would be executed?
The classes this migration should run after.
What if, i have a generic MigrateFieldMigration
, that is added as a service multiple times (but with different params)? Wouldnt it be better to return the service IDs insteadof class name? If you use autowiring and service defaults, this is then also the class name. (because services from service defaults become class name as service name).
I guess the services are tagged with contao.migration
anyway. Wouldnt it make more sense to pass the runAfter, via tag property? (similar to various priority props used with already existing container tags)
Yes. Alternatively we could execute the whole thing until all return false ;-).
But then you can not display the description beforehand, because you dont know the migrations the should be run after you executed a set of migrations.
Probably we shouldn't do what I just said then. :-)
A solution would be to remember all service IDs of the migrations already ran and only execute them once in a migrate step. But that still does not fix the description problem.
This won't happen that often, I guess or do you have an example?
In most cases it would be a dev mistake or out-of-scope usage, but imagine 2 different extension trying to modify a core field (slightly, e.g. changing collation or length). The shouldBeRun
checks for current field length/collection matching the updated one.
I guess the services are tagged with contao.migration anyway. Wouldnt it make more sense to pass the runAfter, via tag property? (similar to various priority props used with already existing container tags)
Hmm. Right now I think I'd rather not clutter the container with that and have a solution analogous to the manager plugins.
In most cases it would be a dev mistake or out-of-scope usage, but imagine 2 different extension trying to modify a core field (slightly, e.g. changing collation or length). The shouldBeRun checks for current field length/collection matching the updated one.
True, but then these two extensions would likely be incompatible anyways (or should enforce a more relaxed constraint like >= / <= instead of ==).
Alternatively we could execute the whole thing until all return false ;-).
But then you can not display the description beforehand, because you dont know the migrations the should be run after you executed a set of migrations.
Yeah, that's why it would probably be better to only run it once (or use a command flag if we want to run more than once, like -e or -e=4 or whatever).
Most importantly this IMHO should be a slim solution + interface that does not try to cover all migration cases (and is not overengineered) but provides a sufficient replacement for the runonce.php
files. We can still improve the command later on and e.g. add options or allow more detailed config.
Most importantly this IMHO should be a slim solution + interface that does not try to cover all migration cases (and is not overengineered) but provides a sufficient replacement for the
runonce.php
files.
This would probably require the initial iteration to not offer shouldBeRun()
at all, but again track executed migrations on Contao's side (which we wanted to avoid IIRC?).
Otherwise, we'd either have to think out shouldBeRun()
's behaviour from the start or introduce breaking changes later.
I'm not saying we should not discuss the behaviour? I just want the whole thing to stay managable which means: small interfaces, few features to begin with. Small doesn't mean empty 😉.
Without shouldBeRun()
an event would probably be suited better. This was discussed as well. But then e.g. everyone would need to replicate the logic / there could not be a dry run (btw. something that is unfortunately missing in the install tool). I don't think the application should track installs - there is way too much that can happen (from manual DB migration, installing - uinstalling - installing another version - ... + side effects from other extensions that operate on the same fields).
Plus: The core's migrations actually use a similar interface right now and could easily be adapted.
Hmm. Right now I think I'd rather not clutter the container with that and have a solution analogous to the manager plugins.
What do you mean with clutter? The tag should be there anyway? And the container is responsible for providing a collection of migrations (via tag). A single migration wouldnt know about other migrations, only the container.
As discussed at the developer meeting, we want to use tagged services (contao.migration
) with the following interface:
namespace Contao\CoreBundle\Migration;
interface MigrationInterface
{
public function getName(): string {}
public function shouldRun(): bool {}
public function run(): void {}
}
Instead of getRunAfter
we want to use priorities to manage the order of migrations.
The migrations can be executed with a contao:migrate
command which runs in three steps:
shouldRun()
and execute them after confirmation (yes/no
).yes/complete/no
) where yes
doesn’t delete columns and tables.An AbstractContaoMigration
class can default getDescription()
to the name of the class like “installation-bundle Version480Update”.
The install tool should work the same as the command. Once the install-tool gets removed the manager can use the contao:migrate
command.
So, since contao/contao#709 has landed, we can close this one I guess. 🚀
The Future of
runonce.php
In the process of transitioning away from classical Contao Modules to Contao Manager Plugins, there should be a discussion about what will happen to the good old module installation hook — the
runonce.php
.So this is my take on the topic.
What is the gist of the
runonce.php
?Having a
runonce.php
is — in Composer terms — apost-package-install
/post-package-update
script that is executed on a per-package basis. It's all about lifecycle handling.Do we want to extend the support of lifecycle hooks?
I've personally been missing a feature contrary to the
runonce.php
: a script that will be executed after uninstallation, to do cleanup work for changes therunonce.php
may have introduced.I've briefly discussed this option with @ausi but we didn't quite agree on the question if this is actually a thing Contao should have. For me that would've been great have from time to time, but I've used Contao modules for a bunch of rather "meta" work (for example managing git hooks) with side effects beyond the things Contao modules usually do.
Since one of the Managed Edition's goals is to take away a bunch of Symfony's very explicit nature — and thus to make plugin management more user friendly — I think a way to do cleanup on package uninstallation would be useful.
Module updates from one version to another have always been achievable by some clever code in the
runonce.php
. However, that's not the greatest of developer ergonomics.Solution
Design
I've put some thought into how lifecycle hooks could be implemented and came to the conclusion that an incremental migration approach (as used e.g. in Laravel) may be the best bet.
It could be achieved by a new
PluginInterface
— let's call itMigrationPluginInterface
.The interface has one method,
getMigrations()
. It should return an associative array mapping semantic versions to classes implementing theMigrationInterface
:The
MigrationInterface
itself is inspired by Laravel's Database Migrations and requires implementation of theup()
anddown()
methods, performing the migration and a corresponding rollback:Naturally, to work as expected, all
semver => MigrationInterface
mappings would have to be immutable once published. It would be on the developers to follow that rule since I don't see a way to enforce this, but I think that's okay.Execution
After installation of a Contao Plugin implementing the
MigrationPluginInterface
, all its migrations will be executed in order of their versions.After updating a Contao Plugin, only its migrations for versions greater than than the previously installed one are executed.
When uninstalling a Contao Plugin, all its migrations are rolled back.
Feedback
Of course I'm curious to hear your opinions regarding the topic and the suggested solution.
I'd especially be interested to hear how your modules use the
runonce.php
and if you'd ever been in need of a cleanup hook.