replikativ / datahike

A fast, immutable, distributed & compositional Datalog engine for everyone.
https://datahike.io
Eclipse Public License 1.0

[Bug]: :config-does-not-match-stored-db for file storage when using VPN #661

Open claj opened 6 months ago

claj commented 6 months ago

What version of Datahike are you using?

0.6.1557

What version of Java are you using?

"17.0.8.1"

What operating system are you using?

MacOS

What database EDN configuration are you using?

{:store {:backend :file
         :path "datahike2"}
 :attribute-refs? true}

Describe the bug

I start the (Emacs nREPL) project as usual and initialize the database. I confirmed that I can start the project and connect to the database.

I exit the project.

I turn on WireGuard, a VPN service.

I start the same project again and get the following error (which I partly formatted). The only difference is the IP address in the :store entry: the current config has "127.0.0.1", while the stored config has "192.168.1.121", which is my computer's IP on the local network:

ERROR [datahike.connector:130] - Configuration does not match stored configuration.

{:type :config-does-not-match-stored-db,
 :config
 {:keep-history? true,
  :search-cache-size 10000,
  :index :datahike.index/persistent-set,
  :store [:file "127.0.0.1" "datahike2"],
  :store-cache-size 1000,
  :attribute-refs? true,
  :crypto-hash? false,
  :schema-flexibility :write,
  :branch :db},

 :stored-config
 {:keep-history? true,
  :search-cache-size 10000,
  :index :datahike.index/persistent-set,
  :store [:file "192.168.1.121" "datahike2"],
  :store-cache-size 1000,
  :attribute-refs? true,
  :crypto-hash? false,
  :schema-flexibility :write,
  :branch :db},

 :diff
 ({:store [nil "127.0.0.1"]}
  {:store [nil "192.168.1.121"]}
  {:keep-history? true,
   :search-cache-size 10000,
   :index :datahike.index/persistent-set,
   :store [:file nil "datahike2"],
   :store-cache-size 1000,
   :attribute-refs? true,
   :crypto-hash? false,
   :schema-flexibility :write,
   :branch :db})}
Execution error (ExceptionInfo) at datahike.connector/ensure-stored-config-consistency (connector.cljc:130).
Configuration does not match stored configuration.
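
A note on reading the :diff entry: its three elements have the same shape that clojure.data/diff returns (things only in the first argument, things only in the second, things in both). Whether Datahike computes it exactly this way is an assumption; the sketch below only reproduces the shape with trimmed-down copies of the two configs, using nothing beyond clojure.data from the standard library.

```clojure
(require '[clojure.data :as data])

;; Trimmed-down copies of the two configurations from the error above.
;; Only :store and :branch are kept to make the result easy to read.
(def config        {:store [:file "127.0.0.1" "datahike2"]     :branch :db})
(def stored-config {:store [:file "192.168.1.121" "datahike2"] :branch :db})

;; data/diff returns [only-in-first only-in-second in-both].
;; Vectors are compared position by position; nil marks a position
;; that does not belong in that part of the result.
(def config-diff (data/diff config stored-config))

(println config-diff)
```

The first two elements contain only the middle position of the :store vector, so that position (the address) is the sole mismatch; the backend (:file) and the path ("datahike2") land in the third, shared element.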

What is the expected behaviour?

I expected the file storage to be happily unaware of which IP the JVM resolved as current IP.

To me it is unexpected that a local file store is tied to whatever IP address Datahike deduces for my machine.

How can the behaviour be reproduced?

Probably by using WireGuard as a VPN service, as described above.

According to this Clojurians #datahike Slack thread, the behaviour is likely related to the newly introduced global address space.

whilo commented 6 months ago

Hey @claj! Thanks for reporting. Can you add a :scope to your file store configuration explicitly? If you don't, it is inferred from your IP address, which can change depending on your network configuration. The scope is supposed to disambiguate filesystems (stores), so that stores pointing to the same path on different machines are not considered to be the same.

whilo commented 6 months ago

Let me know if this fixes the issue (it should).

claj commented 6 months ago

Sorry, I lost track of the ticket. Adding :scope "abc" works. Thank you! Could this be added to the README, ideally with an explanation of what a :scope is? I could not find any occurrence of it when looking this up in the datahike repo.
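
For readers landing here, the configuration from the report with an explicit scope might look like the sketch below. It assumes that :scope belongs inside the :store map next to :backend and :path, which is how "add a :scope to your file store configuration" reads; the thread does not show the final config, and the name cfg is only illustrative. The value "abc" is the one reported to work above; any string that stays the same across network changes should serve the same purpose.

```clojure
;; Sketch of the reporter's config with an explicit scope.
;; Assumption: :scope is a key of the :store map (not shown in the thread).
(def cfg
  {:store {:backend :file
           :path    "datahike2"
           ;; Fixed identifier used instead of the inferred IP address.
           :scope   "abc"}
   :attribute-refs? true})
```

With a fixed scope, the :store entry in the error above would presumably read [:file "abc" "datahike2"] regardless of whether the VPN is on.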

whilo commented 6 months ago

Cool that it works! Yes, it definitely needs to be added to the README.

psagers commented 5 months ago

I've encountered this in development, but so far I've been lucky in production. I did just run into another scenario, though: I copied a standalone production database to my dev machine to experiment, and not only did I need to force the scope, I also had to make a symlink from the production database path (e.g. /var/db/my-app) to the local development path. I got it working, but it's pretty icky.

Clearly there are some deployments/scenarios in which this part of the config matching is all hassle and no value. Some kind of "I know what I'm doing, just read the data" switch would be very helpful. And/or an easy mechanism to update the stored config to match a new environment.