theforeman / puppet-pulpcore

Puppet module for setting up Pulp 3 as part of Katello installation
GNU General Public License v3.0
2 stars 28 forks source link

Always run pulpcore-manager migrate #351

Closed evgeni closed 1 month ago

evgeni commented 1 month ago

Pulpcore uses post migration hooks to create data in the database, which sometimes changes even if there are no real migrations to be run.

This is fine from a Django PoV -- post migration hooks run after every migrate call, even if there have been no changes.

However, this means that we need to run migrate unconditionally as otherwise we might miss changes to the hooks (or their data) when there is no migration attached.

An example of such a change is https://github.com/pulp/pulpcore/commit/20cffca49de5f3bfac44e0160c1f0fb262141ea5 where a new role was introduced and our setup started failing as it was expecting the role to exist after the upgrade, but it did not as we did not call migrate.

evgeni commented 1 month ago

This makes our module non-idempotent so I'm not sure what to do here. It feels to me that they broke --check and that would be the correct fix.

Technically correct. But p-m migrate is already idempotent in itself, so there is not much winning here to double it via Puppet.

It reminds me of Foreman's db:seed, which is also ugly.

Yeah.

ekohl commented 1 month ago

From the Puppet perspective it looks really hard to make it idempotent so I wanted to understand what was going on.

This is the code that actually triggers the population: https://github.com/pulp/pulpcore/blob/fa1931d780d9b1b1e409e387add38e42d8ec28b8/pulpcore/app/apps.py#L122-L136

Reading the post_migrate documentation there's an interesting note:

Sent at the end of the migrate (even if no migrations are run) and flush commands. It’s not emitted for applications that lack a models module.

Another thing that's interesting:

sender An AppConfig instance for the application that was just installed.

Or is that the AppConfig for the file it's imported from (in this case PulpAppConfig)? It can't be the various other apps.

I had always assumed people in Django used data migrations to ensure records were present. At this point I'm not sure how to best proceed. I'm a bit tempted to at least implement a parameter to toggle the behavior so users can make it idempotent.

evgeni commented 1 month ago

I had always assumed people in Django used data migrations to ensure records were present.

For a general "in Django", this is correct: https://docs.djangoproject.com/en/5.0/topics/migrations/#data-migrations

However, Pulp does not allow backports of migrations (https://github.com/pulp/pulpcore/pull/952) and in the particular case of data migrations flush kills the data but does not re-apply the migration afterwards (as it's marked as run): https://pulp.plan.io/issues/7710

At this point I'm not sure how to best proceed. I'm a bit tempted to at least implement a parameter to toggle the behavior so users can make it idempotent.

You mean a pulpcore::database::always_run_migrations that is true by default but CI and some users can set to false?

Edit: mhh, pulpcore::database is marked as private API, should this boolean be pulpcore::always_run_database_migrations? Or should we just not care to make this officially public?

evgeni commented 1 month ago

:green_apple:

evgeni commented 1 month ago

Some further consideration: making it non-idempotent will always restart Pulp services, meaning the module becomes really unreliable for people who use them as Puppet was intended (constantly running).

What makes you think it will restart pulp services? I see nothing that would notify the service objects after a migration has happened?