botpress / v12

Botpress OSS – v12
https://v12.botpress.com
GNU Affero General Public License v3.0
68 stars 83 forks source link

[FEATURE] Migrations should not depend on botpress.config.json - [BUG] Migrations are unexpectedly skipped #1335

Closed JustusNBB closed 2 years ago

JustusNBB commented 3 years ago

Is your feature request related to a problem? Please describe. When updating our Botpress Stack, we bump the version number in the Dockerfile and watch it migrate after build and reboot, then there is additional changes (mostly concerning config schema) to be committed alongside, including the version number bump in the botpress.config.json.

Putting this into a single commit produces unexpected behavior on other developers workstations, due to how BPFS=disk works, migrations will be skipped.

Describe the solution you'd like Migrations should not depend on a version number in some config file, but rather be self-consistent, independent from the injected default config and its version. Migrations should not be skipped because of the(/a wrong) config dependency.

Botpress needs to assume differently whether migration is necessary at all

Describe alternatives you've considered Noone yet, this is a weird stumbling block and overhead in development (using docker-compose)

Additional context Tested and had skipped migrations with versions 12.19 to 12.21!

This concerns mostly/only development. In production (BPFS=database) migration and configuration files are ignored in favor of the existing configuration in the database - I haven't tried to reproduce the bug there, should I to be sure?

allardy commented 2 years ago

I'm still thinking of what the best mechanism should be here.

There are two different version:

Table srv_migrations only serves to track logs and what was migrated (to display in the diag report). The reason we have the two different versions is that if you update the content locally, then commit on github, then the content will be updated, but not the database structure. It will be updated on the next run.

I don't understand how you end up with the version bump in the botpress.config.json file to commit if you have BPFS=database, since it should update the file in the database directly.

If you run the migration with BPFS=disk, then it should update the content files along with the configuration file, and since all those files will be committed, the migration would technically not have to be executed on the dev's computer.

I'm not entirely sure to understand the development flow.

A quick workaround would be to set TESTMIG_ALL=true on the developer .env file to run all migrations. Another solution would be to simply try to execute all migrations on every startup, instead on relying in the version of the config file, but that would mean we really must ensure that each migrations must correctly check if it has to run.

@EFF what do you think of this approach ? I thought about it a couple of times before, we could just try to run all migrations everytime, if none needs to be executed, then we could be silent about it (not display a log at all), but show a log if > 1 needs to run. We would need to refactor each existing migrations tho, but it would be safer

JustusNBB commented 2 years ago

I don't understand how you end up with the version bump in the botpress.config.json file to commit if you have BPFS=database, since it should update the file in the database directly.

I don't know if this issue can be reproduced with BPFS=database, non-dev deployments should not be affected.

If you run the migration with BPFS=disk, then it should update the content files along with the configuration file, and since all those files will be committed, the migration would technically not have to be executed on the dev's computer.

I'm not entirely sure to understand the development flow.

I am also not sure about my development flow, since I am constantly running into migration issues with Botpress :|

The issue appears after I migrate my local deployment, then we also want to update my colleague's local deployments - currently this requires checking out only the version bump commit, to then startup to manually check local migration results against the ones in the following commit to be safe. The migrations folder is gitignored, so this concerns mostly files like botpress.config.json.

It will be updated on the next run.

So there is two different migration mechanisms that have different triggers? Very interesting. I did not actually check how broken the setup is after skipped migrations with the file/content premigrated. The database will self-migrate anyways after the second restart? This mechanic could explain some other dubious migration issues I had (sorry having trouble to nail those down, that why this remains unspecific).

A quick workaround would be to set TESTMIG_ALL=true on the developer .env file to run all migrations.

This sounds like a good workaround, if that works it only needs to be documented ;)

that would mean we really must ensure that each migrations must correctly check if it has to run.

Yes that is what I would usually expect from safe migrations.

allardy commented 2 years ago

The issue appears after I migrate my local deployment, then we also want to update my colleague's local deployments - currently this requires checking out only the version bump commit, to then startup to manually check local migration results against the ones in the following commit to be safe. The migrations folder is gitignored, so this concerns mostly files like botpress.config.json.

Just to be sure, when you update your local deployment, do you commit the whole data folder, or just part of it, like the global one ? That would give me a good hint of the issue.

The database will self-migrate anyways after the second restart? This mechanic could explain some other dubious migration issues I had (sorry having trouble to nail those down, that why this remains unspecific).

Yep. Previously we had a single version for database and content changes, but that proved to be problematic when using a CI/CD because the version used was in the botpress.config.json file for both, so if you committed it, the database would not be migrated correctly.

Yes that is what I would usually expect from safe migrations.

Each migration already check if it needs to run beforehand, but we opted to not run them all, all the time so to not pollute logs with "no migration needs to run". But I'm thinking of a better way to implement this

JustusNBB commented 2 years ago

Just to be sure, when you update your local deployment, do you commit the whole data folder, or just part of it, like the global one ? That would give me a good hint of the issue.

Yes that is the root of it, I had been selectively committing relevant files in data/global and exactly those (mostly botpress.config.json) are the cause of this, I will try to remove data/global from version control for now, that should be another workaround!

JustusNBB commented 2 years ago

Confirming that the workaround/recommendation to keep data/global out of version control works. Adding additional BP_CONFIG_* overrides is the solution, in this case I added e.g. BP_CONFIG_SUPERADMINS. This way, the Botpress configuration (and versioning) is not shared between deployments through files.