kuzzleio / kuzzle

Open-source Back-end, self-hostable & ready to use - Real-time, storage, advanced search - Web, Apps, Mobile, IoT -
https://kuzzle.io
Apache License 2.0
1.44k stars 124 forks source link

Framework: allow the execution of code only once #1999

Closed scottinet closed 3 years ago

scottinet commented 3 years ago

Feature Description

When an application starts for the first time, it's commonplace for it to have some init tasks to perform, such as setting up the storage mappings, perhaps a few structures to build, some data to fetch from distant APIs, and so on.

There is no easy way at the moment to ask Kuzzle to execute something "only once", meaning once per environment, whatever the number of Kuzzle nodes started on a cluster, and never executed again as long as the environment isn't entirely reinstalled.

Example Use Case

Loading mappings into Elasticsearch to set up a storage environment.

Possible Solutions

The technical details themselves are pretty standard: use mutexes to prevent concurrent executions if multiple nodes are executed at the same time, and store a key into Redis once the task has been successfully executed to prevent further attempts at it.

Now, we have 2 possible ways of proposing that feature to Kuzzle users:

  1. Add a new app.init method: this is an optional method. If set, its content will be executed only once, during the entire lifetime of an environment
    • Pros: easier to use and to document
    • Cons: means that new init tasks cannot be added on an existing environment without executing all other tasks (and will need some kind of "start the init phase again" mechanism)
  2. Add a new method (proposal: kuzzle.doOnce('id', <handler function>)), that will execute the handler attached to the provided identifier only once per environment, during its lifetime
    • Pros: tasks can be added and dropped on the fly on an existing environnement, during its lifetime (more flexible)
    • Cons: less intuitive than a app.init method
etrousset commented 3 years ago

In the 1 solution, I would rather call the method app.deploy, as it is kind of a mater off deploying the app for a fresh install or after a evolution.

How would you handle subsequent deploiement of the application? The need for this method might change whether you are doing a fresh install or a upgrade from a prior version?

scottinet commented 3 years ago

I like app.deploy

And yes, the downside of that solution is the same one than I expose in its "cons" section. I personally think that the doOnce method is more suited, since it allows new "only once" tasks to be added during the lifetime of an environment.

etrousset commented 3 years ago

app.deploy('versionId', )

scottinet commented 3 years ago

well, then it's the kuzzle.doOnce solution, only renamed kuzzle.deploy (which sounds better indeed) :grin:

Aschen commented 3 years ago

I also like the app.deploy option :+1:

Maybe we could introduce a new application version concept to allows users to increment a version number for new deployments. Kuzzle could refuse to start if the version number has not been increased (this will be an option and disabled by default)

scottinet commented 3 years ago

Your proposal introduces rules that force developpers using that feature in one way only. Moreover, least I didn't understand your proposal correctly, it doesn't solve the problem of evolving applications in production: you don't want to deploy previous, already deployed code.

The app.deploy('<id>', <handler>) way is explicit, easy to understand, easy to document, and versatile. id can be anything that app devs want: this could be some name used to fragment otherwise huge handlers into smaller bits for readability, or it could be the application version to make it more tracable. Or both. Or neither of those.

Aschen commented 3 years ago

I'm not sure I understand how this feature will be used then.

For example if I want to deploy mappings only on a new deployment:

app.deploy('mappings-2.4.3', () => {})

Then for the next release I will have to update manually this line before deploying:

app.deploy('mappings-2.4.4', () => {})

Is it correct?

scottinet commented 3 years ago

That's not how I envision this feature. It's more like this:

  1. Deploy new mappings
app.deploy('mappings-2.4.3', () => {})
  1. Deploy other, new mappings in another version: mappings deployed in the version 2.4.3 must not be replayed when upgrading an existing environment, since because it's an upgrade, they are already there:
app.deploy('mappings-2.4.4', () => {})

If you need to fix earlier deployments, you'll need to have some kind of upgrade procedure, which will appear in deployments only as the fixed version. For instance, if mappings deployed in version 2.4.3 were erroneous, you'd fix them in your handler to make sure they are correct the next time you install a new Kuzzle environment from scratch, but they won't be replayed automatically by Kuzzle, which is what you want: when upgrading your env to fix your mappings, you'll need to have some kind of data migration upgrade to fix them.

Aschen commented 3 years ago

My use case was to deploy mappings templates stored in the database to every tenants only when the application is deployed.

For me it should be more something like:

app.deploy(app.version, () => {})

I don't really like your solution because we will end up with a file with a bunch of app.deploy calls. I think we should rather have a proper migration system for this.

I wanted to introduce and force the concept of application version so we can execute actions when a new version of the application is deployed. For example this feature could help to trigger an hypothetical mappings migration system to automatically play new migrations only when the first node is booting.

scottinet commented 3 years ago

What about building new environments? And with your solution, does that mean that previous deployments are, by definition, obsolete code?

It seems we need a workshop on that one. :thinking:

scottinet commented 3 years ago

After discussing this issue with Adrien, we settled for the solution detailed here: https://github.com/kuzzleio/kuzzle/issues/1999#issuecomment-792600212