yogthos / migratus

MIGRATE ALL THE THINGS!

Question: how should sensitive data be handled in migratus? #264

Open WorldsEndless opened 4 months ago

WorldsEndless commented 4 months ago

Migratus has been a great tool for the initial design of our schemas and tables. In production, though, sensitive data enters the application. Encoding this data as insertions in "up" migrations is desirable, since it lets developers work against real data. But how should it be stored? Our solution so far has been to keep the project private. For the time being, that means a private GitHub repository, where the migrations are just one folder alongside our application code.

Searching, I find surprisingly little that addresses the concern of storing sensitive data (names, phone numbers, requests and trouble tickets) in database migration files. Surely this is a common problem. How should I handle db migrations that include sensitive information in a secure but developer-accessible manner?

yogthos commented 4 months ago

One approach would be to use something like Selmer templates for the migrations and inject the sensitive values during the build step in CI. So, you could have something like resources/templates/migrations, and then add a build task that compiles the templates into the actual migration files, which are written to resources/migrations.
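
A rough sketch of such a build task, written in Python with the standard library's `string.Template` standing in for Selmer so that it runs without dependencies. The function name `render_migrations`, the `$PHONE`-style placeholders, and the idea of reading values from the CI environment are assumptions for illustration, not anything Migratus or Selmer provides.

```python
from pathlib import Path
from string import Template


def render_migrations(template_dir, output_dir, values):
    """Render every *.sql template in template_dir into output_dir.

    `values` is any mapping of placeholder name -> secret value,
    e.g. os.environ in a CI job (an assumption of this sketch).
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(template_dir).glob("*.sql")):
        # substitute() raises KeyError when a placeholder has no value,
        # so the build fails instead of writing a half-filled migration.
        # Values are inserted verbatim; no SQL escaping is applied here.
        sql = Template(src.read_text()).substitute(values)
        dest = out / src.name
        dest.write_text(sql)
        written.append(dest)
    return written
```

In CI this might be called as `render_migrations("resources/templates/migrations", "resources/migrations", os.environ)`, with resources/migrations excluded from version control so that only the templates are committed.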

ieugen commented 4 months ago

I would not put sensitive data in migrations. Migrations are primarily for schema changes, not for data changes. If you use them for data changes, I would see that as a red flag. You could make data changes in a migration, but mainly when the schema changes and you need to split, merge, or add default values.

Have an import process for data and manage that as operational changes, not schema migrations, e.g. an import/export feature in the app.

yogthos commented 4 months ago

I agree with that as well; as a rule, it's best to keep data and app separate. My approach has been to keep the jar environment-agnostic, and then inject any environment-specific data such as db connections, access tokens, etc. as environment variables into containers during deployment. Then the jar starts up and picks up the environment at runtime.
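
The runtime half of that approach could look roughly like the following, shown in Python for brevity. The variable names `DB_URL`, `DB_USER`, and `DB_PASSWORD` and the function `load_db_config` are hypothetical; the thread does not name any.

```python
import os


def load_db_config(env=None):
    """Build connection settings from environment variables at startup."""
    # Default to the real process environment; a plain dict can be
    # passed instead, which keeps the function easy to exercise.
    env = os.environ if env is None else env
    required = ("DB_URL", "DB_USER", "DB_PASSWORD")  # hypothetical names
    missing = [name for name in required if name not in env]
    if missing:
        # Refuse to start rather than run with a partial configuration.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "url": env["DB_URL"],
        "user": env["DB_USER"],
        "password": env["DB_PASSWORD"],
    }
```

Because nothing sensitive is baked into the artifact, the same build can be deployed to every environment and only the injected variables differ.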