michael-simons / neo4j-migrations

Automated script runner aka "Migrations" for Neo4j. Inspired by Flyway.
https://michael-simons.github.io/neo4j-migrations/
Apache License 2.0
115 stars 23 forks source link

Migrations fail when there is a missing migration file that has already been applied. #867

Closed sibethencourt closed 1 year ago

sibethencourt commented 1 year ago

The current situation is the following:

We are currently running the neo4j-migrations as part of our release using the springboot integration, so migrations run on application bootstrap.

We had to do a rollback of our application because a bug, and we found that the application couldn't start, the migration phase failed. Turns out, the version we wanted to rollback didn't have one migration file that was introduced in the new version which was making the migration fail completely. At the end we had to move forward we couldn't do the rollback. We got this error:

Caused by: ac.simons.neo4j.migrations.core.MigrationsException: More migrations have been applied to the database than locally resolved.
    ...
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1800)
    ... 16 common frames omitted

Debugged it up to here https://github.com/michael-simons/neo4j-migrations/blob/main/neo4j-migrations-core/src/main/java/ac/simons/neo4j/migrations/core/ChainBuilder.java#L98 Which now it makes sense why it fails. Is trying to find all applied migrations in database on disk. So if one file disappear it will produce the index out of bounds.

Is there a way to allow running migrations with local migrations missing? I understand, in many cases the migrations won't be ran as part of the release, so probably for most people this is not a problem?

Another use case I've found where this could be useful is to delete java migrations. We make java migrations to fix pieces of data in our database, once the data has been fixed, I don't want to keep that migration anymore, I want to delete the code since the migration should never run again, but right now it seems impossible.

Is there anything you recommend to solve these use cases?

michael-simons commented 1 year ago

Hey, nice seeing you again.

I'm inclined to say "works as intended".

I could think of adding a flag "skip non existing" but that would defeat the purpose of the whole thing to some extend in my view.

To help you on your current situation, I could see some options:

I tried the first suggestion with my test dataset on enterprise edition:

https://github.com/michael-simons/neo4j-migrations/tree/main/neo4j-migrations-test-resources/src/main/resources/catalogbased/actual-migrations

pwd
# ~Projects//neo4j-migrations/neo4j-migrations-test-resources/src/main/resources/catalogbased/actual-migrations

tree
# ➜  actual-migrations git:(main) tree
# .
# ├── V10__Assert_empty_database.xml
# ├── V20__Apply_complete_catalog.xml
# ├── V30__Drop_non_existing_constraints_idempotent.xml
# ├── V35__create_data.cypher
# ├── V40__create_some_stuff.xml
# ├── V45__execute_some_advanced_refactorings.xml
# ├── V50__Verify_and_allow_equivalence.xml
# ├── V60__Assert_current_catalog.xml
# ├── V65__Look_at_days_of_future_past.xml
# └── V70__Clean_everything.xml
# 
# 0 directories, 10 files

neo4j-migrations -uneo4j -psecret --location=file:`pwd`  info                                                                             
# 
# neo4j@localhost:7687 (Neo4j/4.4.12 Enterprise Edition)
# Database: neo4j
# 
# +---------+------------------------------------------+---------+--------------+----+----------------+---------+---------------------------------------------------+
# | Version | Description                              | Type    | Installed on | by | Execution time | State   | Source                                            |
# +---------+------------------------------------------+---------+--------------+----+----------------+---------+---------------------------------------------------+
# | 10      | Assert empty database                    | CATALOG |              |    |                | PENDING | V10__Assert_empty_database.xml                    |
# | 20      | Apply complete catalog                   | CATALOG |              |    |                | PENDING | V20__Apply_complete_catalog.xml                   |
# | 30      | Drop non existing constraints idempotent | CATALOG |              |    |                | PENDING | V30__Drop_non_existing_constraints_idempotent.xml |
# | 35      | create data                              | CYPHER  |              |    |                | PENDING | V35__create_data.cypher                           |
# | 40      | create some stuff                        | CATALOG |              |    |                | PENDING | V40__create_some_stuff.xml                        |
# | 45      | execute some advanced refactorings       | CATALOG |              |    |                | PENDING | V45__execute_some_advanced_refactorings.xml       |
# | 50      | Verify and allow equivalence             | CATALOG |              |    |                | PENDING | V50__Verify_and_allow_equivalence.xml             |
# | 60      | Assert current catalog                   | CATALOG |              |    |                | PENDING | V60__Assert_current_catalog.xml                   |
# | 65      | Look at days of future past              | CATALOG |              |    |                | PENDING | V65__Look_at_days_of_future_past.xml              |
# | 70      | Clean everything                         | CATALOG |              |    |                | PENDING | V70__Clean_everything.xml                         |
# +---------+------------------------------------------+---------+--------------+----+----------------+---------+---------------------------------------------------+

neo4j-migrations -uneo4j -psecret --location=file:`pwd`  apply
# [2023-02-19T20:55:31.248427000] Applied migration 10 ("Assert empty database").
# [2023-02-19T20:55:32.367346000] Applied migration 20 ("Apply complete catalog").
# [2023-02-19T20:55:32.907764000] Applied migration 30 ("Drop non existing constraints idempotent").
# [2023-02-19T20:55:33.817671000] Applied migration 35 ("create data").
# [2023-02-19T20:55:35.958907000] Applied migration 40 ("create some stuff").
# [2023-02-19T20:55:36.771924000] Applied migration 45 ("execute some advanced refactorings").
# [2023-02-19T20:55:36.890588000] Items in the database are not identical to items in the schema catalog. The following items have different names but an equivalent definition:
# * Database item `a_local_duplicate_equivalent_to_the_catalog_one` matches catalog item `unique_isbn`
# [2023-02-19T20:55:37.463177000] Applied migration 50 ("Verify and allow equivalence").
# [2023-02-19T20:55:37.579662000] Applied migration 60 ("Assert current catalog").
# Future indexes have been created. Use the following statements to drop all BTREE based constraints and indexes:
# DROP CONSTRAINT unique_isbn IF EXISTS;
# DROP INDEX book_title IF EXISTS;
# DROP INDEX a_new_book_index IF EXISTS;
# [2023-02-19T20:55:38.431933000] Applied migration 65 ("Look at days of future past").
# [2023-02-19T20:55:38.592973000] Applied migration 70 ("Clean everything").
# Database migrated to version 70.

rm V45__execute_some_advanced_refactorings.xml 

neo4j-migrations -uneo4j -psecret --location=file:`pwd` info  
# Unexpected migration at index 5: 50 ("Verify and allow equivalence").

I then used cypher shell, but browser will work just fine:

This is the offending migration who's file I deleted:

match (p)--> (d:`__Neo4jMigration` {version: '45'})-->(n) return p,d,n

It can be fixed as follows:

match (p)-[r]-> (d:`__Neo4jMigration` {version: '45'})-[l]->(n)
create (p) -[nr:MIGRATED_TO] -> (n)
set nr = properties(r)
delete r, d, l
return p, nr, n

Matches, creates a new MIGRATED_TO from the prior to the follower, copies the properties and deletes the old stuff.

After that, info works as expected

neo4j-migrations -uneo4j -psecret --location=file:`pwd` info

# neo4j@localhost:7687 (Neo4j/4.4.12 Enterprise Edition)
# Database: neo4j
# 
# +---------+------------------------------------------+---------+-------------------------------+---------------+----------------+---------+---------------------------------------------------+
# | Version | Description                              | Type    | Installed on                  | by            | Execution time | State   | Source                                            |
# +---------+------------------------------------------+---------+-------------------------------+---------------+----------------+---------+---------------------------------------------------+
# | 10      | Assert empty database                    | CATALOG | 2023-02-19T19:55:30.628Z[UTC] | msimons/neo4j | PT0.245S       | APPLIED | V10__Assert_empty_database.xml                    |
# | 20      | Apply complete catalog                   | CATALOG | 2023-02-19T19:55:32.074Z[UTC] | msimons/neo4j | PT0.584S       | APPLIED | V20__Apply_complete_catalog.xml                   |
# | 30      | Drop non existing constraints idempotent | CATALOG | 2023-02-19T19:55:32.662Z[UTC] | msimons/neo4j | PT0.222S       | APPLIED | V30__Drop_non_existing_constraints_idempotent.xml |
# | 35      | create data                              | CYPHER  | 2023-02-19T19:55:33.778Z[UTC] | msimons/neo4j | PT0.804S       | APPLIED | V35__create_data.cypher                           |
# | 40      | create some stuff                        | CATALOG | 2023-02-19T19:55:35.379Z[UTC] | msimons/neo4j | PT1.151S       | APPLIED | V40__create_some_stuff.xml                        |
# | 50      | Verify and allow equivalence             | CATALOG | 2023-02-19T19:55:36.710Z[UTC] | msimons/neo4j | PT0.628S       | APPLIED | V50__Verify_and_allow_equivalence.xml             |
# | 60      | Assert current catalog                   | CATALOG | 2023-02-19T19:55:37.563Z[UTC] | msimons/neo4j | PT0.082S       | APPLIED | V60__Assert_current_catalog.xml                   |
# | 65      | Look at days of future past              | CATALOG | 2023-02-19T19:55:38.397Z[UTC] | msimons/neo4j | PT0.793S       | APPLIED | V65__Look_at_days_of_future_past.xml              |
# | 70      | Clean everything                         | CATALOG | 2023-02-19T19:55:38.559Z[UTC] | msimons/neo4j | PT0.091S       | APPLIED | V70__Clean_everything.xml                         |
# +---------+------------------------------------------+---------+-------------------------------+---------------+----------------+---------+---------------------------------------------------+

I'll keep this open, maybe adding the skip missing feature or repair missing feature. While thinking about it: A feature to delete one or more migrations as shown above would probably super helpful (also for the class based scenario) and actually easier to add and easier to understand.

Wdyt @sibethencourt ?

sibethencourt commented 1 year ago

Hi Michael, I'm bringing back some additional thoughts on the issue :D

I think for the first use case of wanting to do a rollback, probably the best solution would be to separate the migrations from the releases, which is already a good practice, and we will have to do it at some point anyway.

For the second use case of wanting to delete stale/deprecated data/java migrations, having some help from the CLI would definitely help.

Deleting the whole chain will cause everything to run again, the indexes most likely be fine. But all the data migrations running all again at the same time feels counterproductive.

Deleting specific nodes makes more sense, but I'll have to balance the risk of going to the prod database to run manual queries that could eventually break something. I could create my own script though... But obviously it could eventually break compatibility with neo4j-migrations and cause issues...

Also, what would be the right mechanism to reconcile the files and the database? I imagine there will be cases like:

I created an index a while ago, but it's not used anymore so I want to delete it. I create a new migration to get rid of the index. Now, do I need to keep these files forever? It means that every time I'm creating a new database for testing (maybe hundreds of times per day) I'm creating an index, and deleting it right after in a following migration.

Eventually what I would like to do is to delete the migrations that creates and deletes the old index and forget about it. That would be the main reason why I delete files from the migration.

Things that I would like to be protected that could conflict with not failing when deleting fails:

That's more or less my thoughts on this, do we know how flyway and liquibase handle these situations?

michael-simons commented 1 year ago

Hey.

Flyways repair will mark missing migrations as deleted (https://documentation.red-gate.com/fd/repair-184127461.html) which is probably a sane and inspirational way to deal with this.

Liquibase: IDK.

Did you try my suggested work around?

sibethencourt commented 1 year ago

hi,

I did try deleting one specific node to confirm my hipotesis and it works perfectly. But I won't be able to propose that as a long-term solution for production, since we don't want to manually modify the database, one of the reasons why we use neo4j-migrations ;)

michael-simons commented 1 year ago

Thanks and fwiw, totally understandable. I’m a bit swamped atm but I’m gonna work on this.

michael-simons commented 1 year ago

Hey @sibethencourt while adding a delete feature, I noticed that you need to query a bit different if it affects a migration at the end of the chain

MATCH (p)-[l:MIGRATED_TO]->(m:__Neo4jMigration {version: $version})
OPTIONAL MATCH (m)-[r:MIGRATED_TO]->(n:__Neo4jMigration)
WHERE coalesce(m.migrationTarget, '<default>') = coalesce($migrationTarget,'<default>')
WITH p, l, m, r, n
FOREACH (i in CASE WHEN n IS NOT NULL THEN [1] ELSE [] END |
    CREATE (p)-[nl:MIGRATED_TO]->(n)
    SET nl = properties(l)
)
DELETE l, m, r

This is gonna come to migrations, also a full repair feature.

michael-simons commented 1 year ago

Hey @sibethencourt We have added two new operations, delete and repair and I hope they work for you. I would really appreciate your feedback. Release of 2.2.0 is due today.

You can use those features from the CLI too, in case you're still on 1.x in your application. While I don't want to back part them, it's totally ok to use 1.x in your app and use the repair feature from a 2.x CLI version.