HumanSignal / label-studio

Label Studio is a multi-type data labeling and annotation tool with standardized output format
https://labelstud.io
Apache License 2.0

Firestore as alternative to Cloud SQL (Google Cloud Platform) #903

Open TimDeSmet opened 3 years ago

TimDeSmet commented 3 years ago

Is your feature request related to a problem? Please describe.

The one-click deploy to GCP seems to be flawed.

Cloud Run is meant for stateless containers only, and since the default one-click deployment uses the SQLite database, the container is not stateless.

As a result, all data stored in SQLite (users, projects, configuration, ...) is lost frequently.

Describe the solution you'd like

The ideal solution would be for Label Studio to support a connection to Firestore, a non-relational, document-based database. The combination of Cloud Run and Firestore would:

Describe alternatives you've considered

There are a few alternatives possible at the moment to work around this issue:

None of the options are as good as the proposed connection would be.

gsarti commented 3 years ago

I am currently facing the same issue with a Cloud Run instance: within about 10 minutes everything is lost (projects, accounts, etc.).

@TimDeSmet can you share how you managed to set up the connection to Cloud SQL on GCP? Is it mandatory to use PostgreSQL and set environment variables for host etc. or is it possible to make it work through SQLite?

timdesmetML6 commented 3 years ago

@gsarti I'm sorry to hear you're in the same boat.

I've not yet deployed Label Studio using Cloud SQL (due to a lack of time and frankly the required budget). I would indeed assume that it's necessary to use the PostgreSQL approach.

I'm hopeful that the requested feature will be placed on the roadmap sooner rather than later.

makseq commented 3 years ago

As far as I know, Firestore is a NoSQL database. I also know Django supports it, but I'm not sure that all the features used by LS are possible with NoSQL (we use a lot of complex JOIN queries and other intricate SQL features). So I recommend looking at Postgres, MySQL, and other SQL databases.

gsarti commented 3 years ago

@makseq thank you for the answer. Are there any tutorials on how to set up Postgres on cloud providers with the one-click deploy, by any chance?

makseq commented 3 years ago

We had a similar question about Heroku: https://github.com/heartexlabs/label-studio/issues/848#issuecomment-827854768

You can use the LS PostgreSQL environment variables to set it up on your cloud.

https://labelstud.io/guide/storedata.html#PostgreSQL-database
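
For readers following the suggestion above, here is a minimal sketch of the environment variables involved. The variable names (DJANGO_DB, POSTGRE_NAME, POSTGRE_USER, POSTGRE_PASSWORD, POSTGRE_PORT, POSTGRE_HOST) are given as recalled from the linked guide and should be checked against it. The helper name postgres_env, the host value, and the DB_PASSWORD variable are hypothetical and only serve the illustration.

```python
import os


def postgres_env(host, name="postgres", user="postgres", password="", port=5432):
    """Build the environment variables that point Label Studio at PostgreSQL.

    Assumption: the variable names follow the Label Studio storage guide;
    verify them against the documentation for the version you deploy.
    """
    return {
        "DJANGO_DB": "default",      # select the PostgreSQL backend instead of SQLite
        "POSTGRE_NAME": name,        # database name
        "POSTGRE_USER": user,
        "POSTGRE_PASSWORD": password,
        "POSTGRE_PORT": str(port),   # environment values must be strings
        "POSTGRE_HOST": host,
    }


if __name__ == "__main__":
    # Hypothetical host; the password is read from the caller's environment
    # rather than hard-coded.
    env = postgres_env(
        host="db.example.internal",
        password=os.environ.get("DB_PASSWORD", ""),
    )
    # Print KEY=VALUE lines, masking the password, for review before they are
    # copied into the cloud service's environment configuration.
    for key, value in env.items():
        shown = "********" if key == "POSTGRE_PASSWORD" and value else value
        print(f"{key}={shown}")
```

The resulting key/value pairs would then be set on the container service (for example, as Cloud Run environment variables) so that the data lives in the external database rather than inside the container.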