manyfold3d / manyfold

A self-hosted digital asset manager for 3d print files.
https://manyfold.app
MIT License
822 stars 49 forks source link

Make data migration more robust to stop deployment failures #2982

Open Floppy opened 1 month ago

Floppy commented 1 month ago

The database schema gets added to over time with migrations, and sometimes we have to change the contents as well, for which we have separate data migrations.

Only problem is, those data migrations run with the current code , NOT the code that was there at the time the migration was written, so they slowly drift out of date, and older ones start to break as the code changes.

There are ways around it, but because of the way this app is distributed, and the way that anyone could be upgrading from any version to any other, it's tricky.

For instance, there's one which changes an old "admin" database column into more flexible administrator role. The data migration has to happen after we create roles, but before we remove the column; but the code is from NOW, not THEN, so if there's a change in validations, or something like that... 💥

If we were in charge of deployment, it would be fine, we could run manual tasks etc. But because it's released as a docker container, and users could be upgrading from any version to any other, and the process has to be automated, it's really hard.

... but there must be a solution.

Ideally all data migrations should be removed, and replaced with code that runs on app start instead (although not in normal initializers, that happens when migrations are run)

😵

Errors that have been caused by this problem:

tr1plus commented 1 month ago

Does this mean that upgrading from a (rather) old version to current would cause the db migration to just never complete/work?

I'm fixing up my docker upgrade mechanism and realised this container just never got the TLC it required. I'm not stuck with a container that just keeps trying to do the data migration. I still have the option to just wipe the DB & start from scratch, but want to check if this could indeed be caused by this?

Floppy commented 1 month ago

Yes, this is what's causing that upgrade problem. If you have a specific message that's failing, open another issue with it and I can look at the detail; I have just released some fixes in v0.84.0 as well.

(sorry for the edit, I misread your comment and answered a completely different question at first :) )

tr1plus commented 1 month ago

All good.

I currently migrated to 0.84 and getting an error. I can't tell you which version I started from unfortunately but probably a few months ago at least.

Don't want to hijack this issue, but just as an FYI. I might rebuild from scratch unless you want to actively look into this. I just use this to "scan" my 3D models I use for 3D printing so I can just rescan whatever I have in the folders.:

StandardError: An error has occurred, this and all later migrations canceled: (StandardError)

undefined local variable or method `public_id' for an instance of ModelFile

---------------xxxxxx---------------

Caused by:
NameError: undefined local variable or method `public_id' for an instance of ModelFile (NameError)
Did you mean?  public_send
tr1plus commented 1 month ago

The database schema gets added to over time with migrations, and sometimes we have to change the contents as well, for which we have separate data migrations.

Only problem is, those data migrations run with the current code , NOT the code that was there at the time the migration was written, so they slowly drift out of date, and older ones start to break as the code changes.

There are ways around it, but because of the way this app is distributed, and the way that anyone could be upgrading from any version to any other, it's tricky.

For instance, there's one which changes an old "admin" database column into more flexible administrator role. The data migration has to happen after we create roles, but before we remove the column; but the code is from NOW, not THEN, so if there's a change in validations, or something like that... 💥

If we were in charge of deployment, it would be fine, we could run manual tasks etc. But because it's released as a docker container, and users could be upgrading from any version to any other, and the process has to be automated, it's really hard.

... but there must be a solution.

Ideally all data migrations should be removed, and replaced with code that runs on app start instead (although not in normal initializers, that happens when migrations are run)

😵

FYI - just thinking out loud.

Can you look at the android migration approach: https://developer.android.com/training/data-storage/room/migrating-db-versions

It basically keeps a "version" number in the database, and runs all other migrations sequentially after the version in DB (& upgrades respective version at each point)

Floppy commented 1 month ago

Thanks! Rails does this as well, and we have separate migrations for schema and data; the problem is that the code changes too, and when it runs an older migration, it's not running the older code as well. So, sometimes they don't match.

tr1plus commented 1 month ago

I might be miss understanding (as I have zero experience with Rails and this size of complexity) - but shouldn't you just do a sequential upgrade of your database structure & execute the migration of data between tables/fields sequentially. I don't see how actual app code should be running in this case. Once the sequential migration steps have finished, only then should the app start running.

In my view the migration code & the app code should be 2 separate entities, and app code can only start running once db migration (and thus db version) is up to speed with the app code

Anyway, this is my view without understanding the full codebase. Maybe I'm thinking too "simple" for this and I'm sure there are good reasons for your worry. - This is just based on some (smaller scare) experience I have.

No matter whether the above makes sense or not, thanks for your project :)

Floppy commented 1 month ago

Yeah, this is a Rails thing, it doesn't have that pure separation - it initializes the app when migrations are run. And some of the migrations use application code to change models around. That's the root cause, really, if all migrations were pure SQL we wouldn't have the problem; which is perhaps one solution but would take quite a bit of rework.

tr1plus commented 1 month ago

Very dumb (but from work experience I know that my sometimes dumb ideas are the ones we go for)

While (db.Version != [current_version])
       Execute migration & wait until db.version = current version
EndWhile

#Once all migrations are done, we can init app
Init app

I know this might require some rework - but you would just need to keep your version number in DB, and update it for each sequential migration step