TryGhost / Ghost

Independent technology for modern publishing, memberships, subscriptions and newsletters.
https://ghost.org
MIT License
47.42k stars 10.35k forks source link

Migrations: reset & improvements #7489

Closed ErisDS closed 7 years ago

ErisDS commented 8 years ago

Migration Refactoring

Migrations need revisiting for the alpha, they've currently effectively been removed, only a fresh DB population will work. We are going to refactor some parts of our migration system. The goal is to extract the execution of migration out of Ghost. The Ghost-CLI tool will help us achieving our goal.

Using third party tools

https://github.com/db-migrate/node-db-migrate

Cons:

Pros:

Cons:

We already have a full working migration system, i personally don't see any reason to switch to a third party tool. Migration functionality like seed, create, up, down are in general not hard to implement. The whole wrapping logic around is important.

I am open for opinions. We can also make an experimental testing with knex migrate!

Creating the database

Right now Ghost never controlled creating the database (in case of mysql).

I would love to get support for that.

So before populating the database (see next section), we could execute a command schema.commands.createDatabase (notation might change).

Mysql will execute a raw query to the database, sqlite3 will just skip.

We can control with which encoding we create the database. If database exists, it just skips.

Populating the database

When Ghost-CLI installs a Ghost Version and starts Ghost, it can use an internal Ghost-CLI command before starting the application to check if the database is initialised. This command can be used as single shell command as well.

ghost db —init

Will create all needed tables.

If database is already populated, it just throws/shows an error.

The code for running db —init will still live in Ghost.

So it's for now actually a shortcut for node core/server/data/migration/bin/runner (or similar).

Running populations in a transaction

We need that. We had trouble in the past, that a database was in a broken state, because a container was destroyed while running the populations. (already PR'd). This is also important for LTS . Does not work in mysql, but we have a workaround PR for it.

Seed the database

In Ghost we have a set of initial data, which we call fixtures. (Welcome post, owner, permissions). This can/should be treated as a migration. It is the first migration for Ghost.

Running migrations

Options for inter process communication:

Fork process and communicate via node communication channel.

https://github.com/weixiyen/messenger.js

https://github.com/RIAEvangelist/node-ipc

When Ghost-CLI starts Ghost, it can receive an event.

Ghost can trigger an event like process.emit('started') (or similar) , which Ghost-CLI listens on. So when Ghost is started, Ghost-CLI can execute an internal command to execute migrations. This command can be used as single shell command as well.

Ghost-CLI executes the migrations and if migrations needs to happen, it will send an event to the Ghost child process to boot into maintenance mode.

If finished, it will send another event to boot into normal mode. The whole inter process communication is for now optional, we can also just hack it so it works.

ghost db —migrate

(or even ghost db —migrate —version XXX )

Database versions

Right now we have used the naming pattern 001 - 00X .

Database versions and Ghost source code are very tide together.

Each Ghost Version depends on a specific database schema. So there is only a handful database versions per Ghost Version.

I would keep numbers, but extend/change them a bit:

1.0-1 - is the first migration for Ghost Version 1.0.X

1.0-2 - is the second migration for Ghost Version 1.0.X

1.1-1 - is the first migration for Ghost Version 1.1.X

We don't do migrations for patch updates, so we don't need the whole Ghost version.

Combine fixtures and normal migrations into one folder.

Yes i think that is a good idea.

Make Ghost updates smarter!

Right now, if an error occurs while updating from 1.0.0 to 1.10, your database rolls back to 1.0.0-2 and you cannot start Ghost anymore in worst case.

This can be avoided by making the update process different:

When updating a Ghost Version via Ghost-CLI (ghost update ), we are building a new folder with the new version next to your current version.

We copy the database (in case of sqlite), the config files and the content folder. Then Ghost-CLI will bring the current (!) process into maintenance mode to avoid getting new data. Then Ghost-CLI will execute the migrations on the new folder, if available. If it was successful, the new current version is the new Ghost Version. If it was not successful, database commands rollback and current version stays. So the blog will still run in old version without any trouble and same source code.

Auto updates for Ghost: In case of auto updates, this strategy won't work.

Migrate down?

You have Ghost running with version 1.5.0. And you would like to use your current database with Ghost 1.4.0. So if you install Ghost 1.4.0 and you copy the database of 1.5.0, it should downgrade the database. So each migration file get's an up and down function. That is an optional idea.

Architecture changes?

Basically not so much will change. The biggest change is executing migrations and population from outside and making fixtures being a migration We will tidy up the files a bit (fixtures, bootup etc). Effort is still 2-3 days to have everything in place.

Tasks:

@acburdine I would like to hear your opinion :)

ErisDS commented 8 years ago

@kirrg001 have created this as an issue so that we can start to document what is gonna be done - feel free to edit it with tasks and details 👍

acburdine commented 8 years ago

Ok, I think the issue makes sense. Here are my initial thoughts:

At the moment, ghost-cli (more specifically the local process manager) uses a detached child process to run Ghost. By doing so, the ghost-cli process is allowed to exit and leave ghost to do its thing. Implementing this particular way of having the CLI run migrations means two things:

ErisDS commented 8 years ago

On 3rd party solutions:

The major downside to knex migrations is that it requires you to cycle through the migrations step by step in order to create a fresh DB. This is designed for long-running single-install software, rather than mass distributed & installed software like Ghost. If we switched to knex migrations in its current form, fresh installs would get slower and slower as time went on.

Additionally, as much as I like the idea of knex seed files, knex only supports the concept of there being one seed. Which is kind of annoying, because I could see us using it for both the original DB seed, and for seeding various tests.

Those are my two major reasons for not switching to it. I wrote some more stuff in this issue ages ago but I'm leaning towards keeping our tailor-made solution, after all we have it now and it doesn't need much.


On how the Ghost-CLI will interact with migrations,

The steps for the cli would be something like:

The addition of a Process.emit() on startup is a nice-to-have so that the CLI can tell if everything is ok or not.

This leaves 2 slightly tricky things:


Other things:

kirrg001 commented 8 years ago

Thanks @ErisDS !

how to determine if a migration is needed && how to run a migration

Maybe the easiest way for now: Just ignore if a migration needs to happen or not. Bring the current process into maintenance mode (restart + forward the maintenance flag as env param). Add a timeout of 2-3 seconds to ensure the process runs. Then run the migration runner of the new version (just a spawn of core/server/data/migration/runner for example). The migration runner won't run a migration if there is no one. Wait for the exit command of the migration runner.

ErisDS commented 8 years ago

The existing LTS branch has something like this, where every upgrade triggers maintenance mode. The issue I'm seeing with it, is that the first request from the blog almost always returns 503. As we release more and more often, that's going to look like a bunch of tiny outages?

We could optimise by assuming that if it's a patch update there's definitely no migration to run?

kirrg001 commented 8 years ago

We could optimise by assuming that if it's a patch update there's definitely no migration to run?

Yes can do and add more optimisations later 👍

kirrg001 commented 8 years ago

Regarding: Running populations in a transaction

It turned out it won't work for mysql, see http://stackoverflow.com/questions/4692690/is-it-possible-to-roll-back-create-table-and-alter-table-statements-in-major-sql

Having a PR open for next LTS release, see #7497. So a little workaround and we'll achieve the same result.

kirrg001 commented 8 years ago

Folder structure for database versions:

1.1

1.2

ghost commented 7 years ago

For the Migration could I just install the new Ghost and upload my theme and content folder and import the Content file via the admin panel.

ErisDS commented 7 years ago

@ItsMohammad for questions like this, I highly recommend swinging by our slack channel. This is an issue about writing a new database migration tool - that is adding and removing fields from the database, nothing to do with migrating between old & new Ghost versions. You can also read more about what we're doing in the new "alpha" version of ghost on our dev blog and in the alpha guide.

TLDR; there is currently no migration path to the alpha version of Ghost because it's not finished yet.

kirrg001 commented 7 years ago

Regarding importing/exporting databases and migrations table

I thought about it a while and there is nothing to change/add in my opinion.

Knex-migrator works like this: If your blog was initialised with ghost version 1.5, it will never execute a migration script from an older version (< 1.5).

If you import a JSON file from ghost version 1.2 and we would import the migration scripts, knex-migrator doesn't care about these entries, because the blog was initialised with 1.5. See https://github.com/TryGhost/knex-migrator/blob/master/lib/index.js#L476 Because we insert the imported data via our model layer, all data changes should be adapted automatically. Except of we have added a new field with a constraint, but i am pretty sure LTS can't handle this case.

If you import a JSON file from ghost version 1.9 and we would import the migration scripts, knex-migrator would throw an error, because you have migrations in your database, which do not exist on file system.

I have a branch ready, which can import migrations, just in case we need it.

kirrg001 commented 7 years ago

See https://github.com/TryGhost/knex-migrator

Closing this now. If we encounter any bugs, we can raise new issues.