neo4j / docker-neo4j

Docker Images for the Neo4j Graph Database
Apache License 2.0

What neo4j data files should be ignored when persisting database in version control? #98

Open glevine opened 7 years ago

glevine commented 7 years ago

I have persisted my database by mounting the data volume in my docker-compose file. I am committing the database to git with the idea that as I develop, any changes to my database can be committed as dev data for the next time I start the container, no matter what machine I'm on.

Every time I run docker-compose stop, I notice that the following files have changed.

modified:   data/databases/graph.db/neostore.id
modified:   data/databases/graph.db/neostore.labeltokenstore.db.id
modified:   data/databases/graph.db/neostore.labeltokenstore.db.names.id
modified:   data/databases/graph.db/neostore.nodestore.db.id
modified:   data/databases/graph.db/neostore.nodestore.db.labels.id
modified:   data/databases/graph.db/neostore.propertystore.db.arrays.id
modified:   data/databases/graph.db/neostore.propertystore.db.id
modified:   data/databases/graph.db/neostore.propertystore.db.index.id
modified:   data/databases/graph.db/neostore.propertystore.db.index.keys.id
modified:   data/databases/graph.db/neostore.propertystore.db.strings.id
modified:   data/databases/graph.db/neostore.relationshipgroupstore.db.id
modified:   data/databases/graph.db/neostore.relationshipstore.db.id
modified:   data/databases/graph.db/neostore.relationshiptypestore.db.id
modified:   data/databases/graph.db/neostore.relationshiptypestore.db.names.id
modified:   data/databases/graph.db/neostore.schemastore.db.id
modified:   data/databases/graph.db/neostore.transaction.db.0

Are these the only files I should expect to change even if no changes are made to the database?
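One way to check this empirically is to group the paths that `git status` reports after each stop, so that anything other than an `.id` file stands out. The following is a minimal Python sketch, not an official tool: the function name `group_changed_files` and the two group labels are made up for illustration, and it only classifies paths. It says nothing about whether ignoring them is safe.

```python
from collections import defaultdict


def group_changed_files(status_lines):
    """Group paths from `git status` "modified:" lines by file kind.

    Hypothetical helper: "id_file" and "other" are illustrative labels,
    not Neo4j terminology.
    """
    groups = defaultdict(list)
    for line in status_lines:
        line = line.strip()
        if not line.startswith("modified:"):
            continue  # skip anything that is not a "modified:" entry
        path = line.split(":", 1)[1].strip()
        kind = "id_file" if path.endswith(".id") else "other"
        groups[kind].append(path)
    return dict(groups)


# A few of the lines from the listing above.
sample = [
    "modified:   data/databases/graph.db/neostore.id",
    "modified:   data/databases/graph.db/neostore.nodestore.db.id",
    "modified:   data/databases/graph.db/neostore.transaction.db.0",
]
groups = group_changed_files(sample)
print(groups["other"])
```

Run against the full listing, this separates the fifteen `.id` files from `neostore.transaction.db.0`, which is the one entry that does not follow the `.id` pattern.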

Is it safe to add them to my .gitignore file, or will issues arise as these files drift further out of sync with the rest of the database?

Is there a better way to persist data for development?

If I'm not following best practices, what are they?

Thanks.

nicorikken commented 5 years ago

The manual recommends against backing up the data directory:

Using copy-and-paste to move the internal data directory, in order to transfer and seed databases is not supported. If you have an existing Neo4j database whose data you wish to use for a new cluster, it is recommended to create an offline backup using neo4j-admin dump. The resulting backup can then be used to seed the cluster by following the instructions in Section 5.3.3, “Seed from an offline backup”.
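For reference, the dump/load round trip the manual describes could be scripted roughly as below. This is a sketch that assumes the Neo4j 3.x form of the CLI (`neo4j-admin dump --database=... --to=...` and `neo4j-admin load --from=... --database=... [--force]`); later major versions changed the command layout. The helper names and the `/backups/graph.db.dump` path are hypothetical, and `graph.db` is simply the database name from the listing in this issue. The database has to be stopped before dumping.

```python
import shlex
import subprocess


def dump_command(database, dump_path):
    """Build the argument list for an offline dump (Neo4j 3.x CLI form assumed)."""
    return ["neo4j-admin", "dump", f"--database={database}", f"--to={dump_path}"]


def load_command(database, dump_path, force=False):
    """Build the argument list for loading a dump (Neo4j 3.x CLI form assumed)."""
    cmd = ["neo4j-admin", "load", f"--from={dump_path}", f"--database={database}"]
    if force:
        cmd.append("--force")  # overwrite an existing database of the same name
    return cmd


def run(cmd):
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)


# Hypothetical path; only the command lines are printed here, nothing is executed.
dump = dump_command("graph.db", "/backups/graph.db.dump")
load = load_command("graph.db", "/backups/graph.db.dump", force=True)
print(shlex.join(dump))
print(shlex.join(load))
```

Committing the single dump file instead of the data directory would keep the per-shutdown churn of store files out of version control, at the cost of an explicit dump step before each commit.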

Neo4j did, however, recently change the compression algorithm of the dump between 3.5.4 and 3.5.5, resulting in a format incompatibility :sob: :astonished: :boom: So for now, watch your Neo4j versions; I think persisting the data directory would suit most use cases. As this practice is not recommended anyway, I think this issue can be closed.
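To make "watch your Neo4j versions" concrete: one conservative approach is to record which Neo4j version wrote a dump and refuse to load it with any other version. The sketch below assumes that strict equality is acceptable for a dev setup. It is a deliberately strict policy, not a statement of which versions are actually compatible, and the function names are hypothetical.

```python
def parse_version(text):
    """Turn a version string such as "3.5.5" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def safe_to_load(dump_version, server_version):
    """Strictest possible policy (an assumption): only identical versions match."""
    return parse_version(dump_version) == parse_version(server_version)


print(safe_to_load("3.5.4", "3.5.5"))
```

Pinning the image tag in docker-compose to an exact version, rather than a floating tag, serves the same purpose from the other side.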