fission / fission-workflows

Workflows for Fission: Fast, reliable and lightweight function composition for serverless functions
Apache License 2.0

Horizontal scaling of the workflow engine #184

Open erwinvaneyk opened 6 years ago

erwinvaneyk commented 6 years ago

Currently, the workflow engine cannot be scaled horizontally, because workflow invocations are not assigned to a specific workflow engine instance. If you scale the engine right now, every engine instance will use the event queue to fetch all active invocations and try to continue executing all of them, which leads to duplicate function executions.

To fix this, we need the following:

  1. Assign an "owner" to each invocation. This can be done by adding an owner explicitly to the workflow invocation model, or by adding workflow engine namespaces to the event queue implementation.
  2. Give workflow engines a way to recognize orphaned invocations. During downscaling, a workflow engine instance might be killed while it is still managing invocations. The other instances should be able to recognize this and hand off the orphaned invocations to another workflow engine instance.

cc @thenamly

ghost commented 6 years ago

The reason for horizontally scaling the workflow engine is: