GoogleCloudPlatform / cloud-spanner-emulator

An open source emulator for Cloud Spanner.
Apache License 2.0
261 stars, 42 forks

[Feature Request] Emulator persistence #33

Open mox-bot opened 2 years ago

mox-bot commented 2 years ago

It appears that the emulator only uses an InMemory storage interface. This is fine for testing but is rather inconvenient for local development with an unstable internet connection.

Is it possible to add a disk-backed option to allow non-ephemeral data persistence?

snehashah16 commented 2 years ago

Hello,

We carefully considered the design and chose a fast in-memory emulator for Cloud Spanner that can be hermetic for testing and also aid local development against Cloud Spanner.

That said, I agree with your use case. May I suggest the following workaround [1]:

# Export
$ spanner-dump -p ${PROJECT} -i ${INSTANCE} -d ${DATABASE} > data.sql

# Import
$ spanner-cli -p ${PROJECT} -i ${INSTANCE} -d ${DATABASE} < data.sql

[1] https://github.com/cloudspannerecosystem/spanner-dump#spanner-dump-
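
A minimal sketch of the workaround above wrapped as reusable shell functions, for illustration only. It assumes `spanner-dump` and `spanner-cli` are installed, that both honor the `SPANNER_EMULATOR_HOST` environment variable, and that the emulator listens on `localhost:9010`. The function names are hypothetical, and the target instance and database are assumed to exist before an import.

```shell
#!/usr/bin/env bash
# Sketch only: function names and defaults are illustrative, not part of any tool.
# Assumption: both CLIs pick up SPANNER_EMULATOR_HOST to talk to the emulator.
: "${SPANNER_EMULATOR_HOST:=localhost:9010}"
export SPANNER_EMULATOR_HOST

# Dump schema and data to a file (defaults to data.sql).
# The dump goes to a temp file first so a failed run never truncates an old dump.
emulator_export() {
  local out="${1:-data.sql}"
  spanner-dump -p "${PROJECT}" -i "${INSTANCE}" -d "${DATABASE}" > "${out}.tmp" \
    && mv "${out}.tmp" "${out}"
}

# Replay a dump into an existing database; refuses to run without a dump file.
emulator_import() {
  local in="${1:-data.sql}"
  [ -s "${in}" ] || { echo "no dump at ${in}" >&2; return 1; }
  spanner-cli -p "${PROJECT}" -i "${INSTANCE}" -d "${DATABASE}" < "${in}"
}
```

With `PROJECT`, `INSTANCE`, and `DATABASE` set, `emulator_export` would be run before stopping the emulator and `emulator_import` after the next start.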

jrabello commented 2 years ago

That would be a very important feature. We need persistence in our local environment for our day-to-day work.

sini commented 1 year ago

Perhaps a middle-ground solution: add support for snapshot dumps and loads, like Redis's RDB behavior, triggered by a SIGHUP or a SQL protocol command? Serialization doesn't need to be fast. When emulating larger systems, applying migrations can take a very long time, so depending on the Spanner emulator for integration testing can be slow. The previously proposed export/import workaround may be faster, but the ability to do this natively and keep warm deployments in Docker containers would be useful.
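
To make the idea concrete, here is a rough sketch of a signal-triggered snapshot implemented outside the emulator, as a sidecar script. This is not an emulator feature: the emulator has no snapshot command, so the sketch swaps in `spanner-dump` for the serialization step. `SNAPSHOT_FILE` and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sidecar, not an emulator feature. Names are illustrative.
SNAPSHOT_FILE="${SNAPSHOT_FILE:-snapshot.sql}"

# Write the dump to a temp file first, then rename it, so a failed dump
# never replaces the last good snapshot.
take_snapshot() {
  if spanner-dump -p "${PROJECT}" -i "${INSTANCE}" -d "${DATABASE}" > "${SNAPSHOT_FILE}.tmp"; then
    mv "${SNAPSHOT_FILE}.tmp" "${SNAPSHOT_FILE}"
  else
    rm -f "${SNAPSHOT_FILE}.tmp"
    echo "snapshot failed; keeping previous ${SNAPSHOT_FILE}" >&2
    return 1
  fi
}

# Install the SIGHUP handler, then idle until terminated.
# The handler runs once the current one-second sleep returns.
run_snapshot_sidecar() {
  trap take_snapshot HUP
  trap 'exit 0' TERM INT
  while :; do
    sleep 1
  done
}
```

Running `run_snapshot_sidecar` in the background and sending it `kill -HUP <pid>` would then write a snapshot on demand, loosely analogous to triggering an RDB save.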

silenceisgolden commented 1 year ago

It would be incredibly helpful to have an automatic feature that either stores the data in a volume, like the main PostgreSQL container does (see PGDATA - https://hub.docker.com/_/postgres), or works like the feature the Firestore emulator added to enable import on startup and export on exit. See https://github.com/firebase/firebase-tools/issues/2269 for the discussion around the Firestore emulator feature. This would give teams much more confidence to adopt Spanner, since engineers could use it reliably in local development environments. Losing local testing data, or even the possibility of losing it, is a huge reduction in confidence and is probably hurting the chances of small teams evaluating Spanner.
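
Until something like this exists natively, the import-on-startup and export-on-exit behavior can be approximated with a wrapper script. The following is a sketch under several assumptions: the emulator is started with `gcloud emulators spanner start` (overridable through the hypothetical `EMULATOR_CMD` variable), `DUMP_FILE` points at a mounted volume, a fixed sleep stands in for a real readiness check, and the instance and database have already been created before the restore runs. All function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch of an import-on-startup / export-on-exit wrapper.
# DUMP_FILE, EMULATOR_CMD, STARTUP_WAIT and the function names are illustrative.
DUMP_FILE="${DUMP_FILE:-/data/data.sql}"
EMULATOR_CMD="${EMULATOR_CMD:-gcloud emulators spanner start}"

# Replay the previous dump, if one exists. Assumes the instance and database
# have already been created in the fresh emulator.
restore_if_present() {
  if [ -s "${DUMP_FILE}" ]; then
    spanner-cli -p "${PROJECT}" -i "${INSTANCE}" -d "${DATABASE}" < "${DUMP_FILE}"
  fi
}

# Dump to a temp file and rename, so an interrupted dump keeps the old file.
save_on_exit() {
  spanner-dump -p "${PROJECT}" -i "${INSTANCE}" -d "${DATABASE}" > "${DUMP_FILE}.tmp" \
    && mv "${DUMP_FILE}.tmp" "${DUMP_FILE}"
}

run_with_persistence() {
  ${EMULATOR_CMD} &                 # unquoted on purpose: split into command + args
  emulator_pid=$!
  # Export only on SIGTERM/SIGINT; a SIGKILL or OOM kill still loses data.
  trap 'save_on_exit; kill "${emulator_pid}" 2>/dev/null; exit 0' TERM INT
  sleep "${STARTUP_WAIT:-5}"        # crude readiness wait (assumption)
  restore_if_present
  wait "${emulator_pid}"
}
```

Used as a container entrypoint with `DUMP_FILE` on a volume, this would give roughly the PGDATA-style experience described above, with the caveat that data survives only a graceful shutdown.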

gauravpurohit06 commented 1 year ago

Please let us know if the other feature request, to have a database and instance available at emulator startup, would be helpful here.

We will evaluate the options for providing full persistence support in the emulator.

silenceisgolden commented 7 months ago

I do not think that the other feature request (70) would satisfy this issue. The other request would just remove a few lines from the setup script that runs in the local env every time you run something like docker compose up in a project. This feature request (33) would remove an entire container/script from the local environment and bring the container's behavior in line with other database containers.

ralphv commented 5 months ago

We use Garden.io to create ephemeral environments that live anywhere from 8 to 100 hours.

K8s sometimes reschedules pods when downsizing clusters, which breaks our environment when Spanner gets redeployed. A persistent option here would be very beneficial. Stopping K8s from redeploying based on budget resources is not an option, as that would break other things in our cluster.

ralphv commented 5 months ago

> hello,
>
> We have made careful considerations about having a fast in-memory emulator for Cloud Spanner that can be hermetic for testing and also aid local development against Cloud Spanner.
>
> Though, I agree with your usecase, and may I suggest the following workaround [1]:
>
> # Export
> $ spanner-dump -p ${PROJECT} -i ${INSTANCE} -d ${DATABASE} > data.sql
>
> # Import
> $ spanner-cli -p ${PROJECT} -i ${INSTANCE} -d ${DATABASE} < data.sql
>
> [1] https://github.com/cloudspannerecosystem/spanner-dump#spanner-dump-

This doesn't help as K8 does the rescheduling...

Containers should be immutable by design and using memory only with no persistent option breaks this.

gvastakis commented 1 month ago

I agree that this feature is really needed. Our app uses both Datastore and Spanner, and when developing locally, we often lose Spanner data after actions like restarting Docker!