This PR is a POC implementation and part of an ongoing discussion about how we should handle data migrations for one-off static fixtures.
I had 3 approaching this that I tried.
Approach 1 (pr): Add fixtures in as a data migration that runs inline with regular migration process
data fixtures are introduced through standard migration process. We can deploy and forget about it once migration is run
fixture data evolves along with the schema moving forward (no need to maintain the fixture schema)
satisfies both local and prod environments
potential issues:
data from the migrations are also introduced into the test environment (since tests run all migrations). A lot of our tests assume empty states and dont setup/teardown to account for data introduced via migrations
Approach 2 (branch): create a separate app for handling migrations and cross reference dependent migrations in other apps:
migrations would look the same and run as part of regular migrations. the difference is that it would sit in a separate app
potential issues:
There are many strange edge cases we can get into and introduces more issues and complexity than it resolves. for example, to illustrate one case:
testimonials -> runs set of migrations to get into a state (0005_rename_avatar_attestation_avatar_test.py)
data_fixtures app creates data migration 0001_load_testimonials that loads testimonials once and has 0005_rename_avatar_attestation_avatar_test.py listed as a dependency
someone renames a field on testimonials and runs makemigrations - a new migration is generated in testimonials app that also has a dependency on 0005_rename_avatar_attestation_avatar_test.py and is unaware of the data migration in data_fixtures app - there is ambiguity on which migration should get executed first - running manage.py migrate in shell succeeds but migrations fail in testcases
if we need to reverse data migrations that also becomes an issue. for example, in the above case, suppose we were to layer on several more migrations in testimonials that alter fields, attempting to reverse the data migration by itself will fail since the fields in testimonials have progressed and the dependency graph as broken.
Approach 3 (this branch): create a separate app for handling migrations and use a flag to control when the app's migrations run and make sure it always runs last
do not need to cross reference other apps as dependencies (although it can when needed)
guaranteed to always run last and so fixtures must always be updated to match the latest schema
avoids the issue of introducing fixture data into tests
can still be run in test pipeline (we can manually execute after tests run)
How this is setup:
We check for an environment variable "RUN_DATA_MIGRATIONS" - by default it is not set and we use MIGRATION_MODULES in settings.py to disable running migrations in the fixture app
The CI and deploy scripts will manually execute the migrations at the end via RUN_DATA_MIGRATIONS=true python manage.py migrate
potential issues:
we are breaking away from conventional/documented data migration pattern of introducing them inline with other migrations
There is a maintenance overhead - since the data being loaded does not evolve with the schema, we will need to manually keep them up to date if models change.
What are the relevant tickets?
Closes https://github.com/mitodl/hq/issues/4687
Description (What does it do?)
This PR is a POC implementation and part of an ongoing discussion about how we should handle data migrations for one-off static fixtures.
I had 3 approaching this that I tried.
Approach 1 (pr): Add fixtures in as a data migration that runs inline with regular migration process
potential issues:
Approach 2 (branch): create a separate app for handling migrations and cross reference dependent migrations in other apps:
potential issues:
makemigrations
- a new migration is generated in testimonials app that also has a dependency on 0005_rename_avatar_attestation_avatar_test.py and is unaware of the data migration in data_fixtures app - there is ambiguity on which migration should get executed first - running manage.py migrate in shell succeeds but migrations fail in testcasesApproach 3 (this branch): create a separate app for handling migrations and use a flag to control when the app's migrations run and make sure it always runs last
How this is setup:
RUN_DATA_MIGRATIONS=true python manage.py migrate
potential issues: