The purpose of this issue is to add the ability to override binary versions via configuration passed in the workflow input and propagated to each activity task that uses the binary.
1/ Passing the binary versions
Simpleflow stays agnostic about this; how the versions reach the workflow is up to you. At Botify we'd do it via the workflow input or via access to a datastore.
2/ Scheduling the activity task with some meta keys
This is already possible since Kubernetes support was merged. The goal here is to pass some runtime configuration without changing the task function's signature, which can be a pain.
The activity task would receive:
{
    # args and kwargs are passed directly to the task
    "args": [ 1, 2, 3, 4 ],
    "kwargs": { "foo": "bar" },
    # meta is *not* passed to the task function, and not taken into account when calculating the task name
    "meta": {
        "binaries": {
            "mybin": "s3://a_bucket/path/to/mybin"
        }
    }
}
Only the binaries used by the given task should be specified.
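A minimal sketch of how a worker could split such a payload, so that "meta" never reaches the task function. The name run_activity is hypothetical and only illustrates the idea; it is not simpleflow's actual dispatch code.

```python
import json


def run_activity(func, raw_input):
    """Hypothetical helper: call `func` with args/kwargs only, keep meta aside."""
    payload = json.loads(raw_input)
    args = payload.get("args", [])
    kwargs = payload.get("kwargs", {})
    # meta is runtime configuration for the worker, not for the task itself
    meta = payload.get("meta", {})
    binaries = meta.get("binaries", {})
    result = func(*args, **kwargs)
    return result, binaries
```

The task function keeps its signature unchanged; the worker is free to act on the returned binaries mapping before running the task.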
On the decider side, the code would look like:
from simpleflow import activity, Workflow


def a_task_using_mybin():
    pass


class MyWorkflow(Workflow):
    def run(self, binaries_dict):
        # schedule a task with a custom version of "mybin"
        mybin_location = binaries_dict["mybin"]  # "s3://a_bucket/path/to/mybin" for instance
        decorated = activity.with_attributes(a_task_using_mybin, meta={"binaries": {"mybin": mybin_location}})
        self.submit(decorated, ...)
As it's not convenient to decorate tasks this late, at submit time, we'll find a solution to make this resolution lazy. We already did this elsewhere but I don't remember how.
3/ Activity worker: Lazily download the binary
Simpleflow will download each binary, if not already present, into /usr/local/bin/<binary name>-<location hash>/ and prepend that directory to PATH before forking to process the activity task.
More complex strategies can be added later, but I like that the default one is just a { binary: location } dict (if you don't like it, tell me!)
As discussed with others, "/usr/local/bin" should be configurable. Some might prefer "/tmp", others "/opt", ... It only has to be configured/resolved on activity workers.
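The default strategy could be sketched as follows. This is only an illustration under assumptions: the function names, the use of a truncated SHA-1 as the location hash, and the injectable fetch callable (standing in for the real S3 download) are all hypothetical, not part of simpleflow.

```python
import hashlib
import os
from pathlib import Path


def binary_dir(base_dir, name, location):
    """Return <base_dir>/<binary name>-<location hash> (hash scheme is an assumption)."""
    digest = hashlib.sha1(location.encode("utf-8")).hexdigest()[:12]
    return Path(base_dir) / f"{name}-{digest}"


def ensure_binaries(binaries, base_dir, fetch, env=None):
    """Download missing binaries and return an environment with PATH prepended.

    `binaries` is the { binary: location } dict from the task's meta.
    `fetch(location, target_path)` is a caller-supplied download function.
    """
    env = dict(os.environ if env is None else env)
    dirs = []
    for name, location in binaries.items():
        target_dir = binary_dir(base_dir, name, location)
        target = target_dir / name
        if not target.exists():  # lazy: download only when absent
            target_dir.mkdir(parents=True, exist_ok=True)
            fetch(location, target)
            target.chmod(0o755)
        dirs.append(str(target_dir))
    existing = env.get("PATH")
    if dirs:
        env["PATH"] = os.pathsep.join(dirs + ([existing] if existing else []))
    return env
```

The returned environment would be handed to the forked process, so base_dir only needs to be resolved on activity workers, and a different location for the same binary name lands in a different directory.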