liquibase / liquibase-neo4j

Neo4j extension for Liquibase
Apache License 2.0

generateChangeLog not working #285

Open Magn1fico opened 1 year ago

Magn1fico commented 1 year ago

Hello everyone,

I'm taking my first steps with the Liquibase plugin for Neo4j, and I'm struggling to generate the initial changeLog file for an existing Neo4j database.

I run the following command: liquibase --changeLogFile=mydatabase_changelog.xml generateChangeLog

The error I get is: Unexpected error running Liquibase: Don't know how to query for sequences on neo4j @ jdbc:neo4j:neo4j+s://_databaseaddress_

At the same time, I am able to connect to the database and run a simple changelog that creates a node, as per the quick start guide, so my guess is that it is not a connectivity issue.

Can anyone advise, please, whether the generateChangeLog functionality works with Neo4j?

I am using plugin version liquibase-neo4j-4.17.2.1 and neo4j-jdbc-driver-4.0.6 for the connection.

fbiville commented 1 year ago

Hello, thanks for the report, I'll need to estimate how wide the gap is to support that functionality. I'll let you know ASAP.

fbiville commented 1 year ago

Hello, @mgazanayi and I had a look and it appears generateChangeLog will not be easy to support due to the nature of Neo4j.

TL;DR: the best we can do is generate the index/constraint changes to match those detected in the database. It would be best to wait for the availability of the createIndex/createConstraint changes.

Long version ⬇️

Relational databases have a strict separation of structure and data. You must first define the structure (tables, columns) before inserting data into it. These structures can be reliably retrieved, and that's what generateChangeLog does: it gets the existing structures and generates the corresponding Liquibase changes for them.

With Neo4j, there's no strict separation between structure and data. They are tied together. You create the data directly, which may or may not adhere to some common schema (defined by constraints beforehand or after the fact).

I don't think it would be a reasonable implementation if we dumped the whole data into a Liquibase change set as a result of generateChangeLog. The data set could be huge, leading to memory issues and probably for no added value (unless most of the nodes/relationships have their own schema - which is not true, most of the time).

If we exclude data, the only artefacts we have left are indices & constraints. We could inspect the target Neo4j database and generate the Liquibase changes that would in turn create indices & constraints when run.

Currently, this is done with raw Cypher queries. @mgazanayi and I just started looking into implementing higher-level constructs that support indices/constraints creation & drop in a Neo4j version-agnostic manner. We want to focus on that first.
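To make the idea concrete, here is a minimal Python sketch of what such a generation step could look like. It is only an illustration, not how the extension works: the `constraints` list is hypothetical input (a real tool would read it from the target database), the helper names are made up, and it assumes that Cypher can be placed in a plain `sql` change in the XML changelog. The Cypher it emits uses the `FOR ... REQUIRE` syntax of recent Neo4j versions; older 4.x versions use `ON ... ASSERT` instead, which is exactly the version-specific detail the higher-level changes are meant to hide.

```python
import xml.etree.ElementTree as ET

# Namespace of the standard Liquibase XML changelog format.
NS = "http://www.liquibase.org/xml/ns/dbchangelog"

# Hypothetical input: uniqueness constraints detected in an existing database.
# A real implementation would query the database for these instead.
constraints = [
    {"name": "person_email_unique", "label": "Person", "property": "email"},
    {"name": "movie_title_unique", "label": "Movie", "property": "title"},
]


def to_cypher(constraint):
    """Render one uniqueness constraint as raw Cypher (recent Neo4j syntax)."""
    return (
        f"CREATE CONSTRAINT {constraint['name']} IF NOT EXISTS "
        f"FOR (n:{constraint['label']}) "
        f"REQUIRE n.{constraint['property']} IS UNIQUE"
    )


def build_changelog(constraints, author="generated"):
    """Build a changelog with one change set per constraint."""
    ET.register_namespace("", NS)  # serialize with a default namespace
    root = ET.Element(f"{{{NS}}}databaseChangeLog")
    for constraint in constraints:
        change_set = ET.SubElement(
            root,
            f"{{{NS}}}changeSet",
            {"id": f"create-{constraint['name']}", "author": author},
        )
        # Assumption: the Cypher statement is carried by a plain `sql` change.
        statement = ET.SubElement(change_set, f"{{{NS}}}sql")
        statement.text = to_cypher(constraint)
    return ET.tostring(root, encoding="unicode")


changelog_xml = build_changelog(constraints)
print(changelog_xml)
```

Note that no node or relationship data appears in the output: only constraints (and, by extension, indices) would be covered by this kind of generation.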

Magn1fico commented 1 year ago

Hello @fbiville!

Thank you very much for looking into my question, and for such a detailed explanation.

At least I am now sure that this is not implemented, and not me doing something wrong.

I guess I'll just repeat the data modeling steps on a new Neo4j instance from scratch.

Cheers!

fbiville commented 1 year ago

If you can share some of your manual work here, that would be super insightful for us! We would love to learn more about the use cases of generateChangeLog for Neo4j.