laminlabs / lamindb-setup

Setup & configure LaminDB.
Apache License 2.0
4 stars 2 forks source link

Idempotency of `lamin init` in case `instance--...--....env` file is deleted #817

Open falexwolf opened 1 month ago

falexwolf commented 1 month ago

Assume you create a purely local instance

lamin init --storage ./my-test-data

You then run

rm ~/.lamin/instance--{my-handle}--my-test-data

Running this again

lamin init --storage ./my-test-data

will not create a new instance, but use the existing SQLite file my-test-data/{uuid}.lndb.

It does so by virtue of hashing the instance slug: https://github.com/laminlabs/lamindb-setup/blob/70167ad1f93983c0b75739d91eaf9c136809ee1b/lamindb_setup/_init_instance.py#L155

This isn't robust to renamings of the handle or the instance.

What would be better is to additionally build a check for whether an SQLite file exists in the storage location. Because it's "ugly" to have the .lndb file in the root level of the storage location rather than in the .lamindb/ folder, we should consider moving it while building this check.

We could also consider not naming it by the instance uuid, but instance just ./lamindb/_database.lndb or another 'better name'. Would need to ponder. The nice thing about naming the sqlite file with the instance uuid is that allows to work with several of these files. Hence, I'm now leaning towards keeping the current naming convention.

falexwolf commented 1 month ago

FYI @Koncopd -- something to keep in mind!

falexwolf commented 1 month ago

Moving the .lndb file should be really trivial.

I removed backward compat that I added more than a year ago and clarified an error message as the first step:

falexwolf commented 1 month ago

If we make improvements, we can also think about storing the lamindb_version in a non-breaking way.