If a workflow request (with a single payload) matches multiple workflows, do you need to make multiple copies of the payload, one for each workflow/namespace?
No, the input data is read-only, so we can keep a single copy of it. The data generated within the workflow by its tasks will be stored in a subfolder.
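To illustrate the idea (a hypothetical layout, not the final storage schema – the prefix, bucket and key names below are assumptions for illustration only): the payload is stored once under a shared prefix, and each matching workflow instance only writes its task output under its own subfolder.

```python
# Hypothetical object-storage key layout (illustrative names, not the actual schema).
# The payload is kept as a single read-only copy; each workflow instance writes
# its task-generated data under its own subfolder, so nothing ever collides.

payload_id = "payload-1234"
payload_key = f"payloads/{payload_id}/input.dcm"  # single, read-only copy of the input


def task_output_key(workflow_instance_id: str, task_id: str, filename: str) -> str:
    """Build the key under which a task stores its generated data."""
    return f"payloads/{payload_id}/workflows/{workflow_instance_id}/{task_id}/{filename}"


# Two workflow instances matched by the same request share payload_key,
# but their outputs live in separate subfolders:
print(task_output_key("wf-instance-a", "task-1", "result.json"))
print(task_output_key("wf-instance-b", "task-1", "result.json"))
```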
@joshliberty in the flow chart, can you explain the line "Add task to dispatch queue"? Can you also explain the last "Acknowledge" flow line? Is there a high-level system architecture diagram that shows how all the components interact? Thanks
@nakfour yes:
As to a high-level architecture diagram, unfortunately not yet - once we have a clearer picture of how the various components interact I'll make sure it's all documented, but at the moment it's all a bit of a work in progress.
I am very much in favor of drafting a high-level arch diagram to facilitate the discussion. A visual = 1000 words. As we make decisions, we can modify. It can be a collaborative effort so @joshliberty doesn't do all the work. WDYT?
Thanks @joshliberty. I think we discussed this in our last meeting with respect to invoking a new task. If the Task Manager and Workflow Manager are part of the same process (i.e. in the same Pod/Container), then it does not make sense to use pub/sub for communication. Also, regarding the "Acknowledge": what if you have multiple instances of the Workflow Manager and two of them read the same message? I ask because you are sending the "Acknowledge" message for RabbitMQ to clear a message only after a long list of tasks.
Babar, Jack, as discussed in our meeting – for the purpose of this ticket you can make the following assumptions:
When a new workflow request is received (see #51), the workflow manager component is notified and should begin processing a new workflow.
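A minimal sketch of what that notification could look like, assuming RabbitMQ consumed via the Python pika client; the queue name md.workflow.request and the payload fields are assumptions for illustration, not part of this ticket:

```python
import json

import pika  # assumption: the notification arrives on a RabbitMQ queue consumed with pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="md.workflow.request", durable=True)  # hypothetical queue name


def on_workflow_request(ch, method, properties, body):
    """Called for each incoming workflow request; kicks off a new workflow instance."""
    request = json.loads(body)
    # Begin processing a new workflow for this request (see #51).
    print("starting workflow for payload", request.get("payload_id"))
    ch.basic_ack(delivery_tag=method.delivery_tag)  # acknowledge only after handling


channel.basic_consume(queue="md.workflow.request", on_message_callback=on_workflow_request)
channel.start_consuming()
```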
The following have been split out into separate subtasks (note: everything else is included as part of this task):
Implementation Notes
Dispatching tasks
To dispatch a task, the following steps need to be taken:
Publishing task dispatch events
Queue: md.workflow.task_dispatch
The storage information object should contain the MinIO/storage engine credentials – the exact schema depends on the storage back-end used.
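A minimal sketch of publishing such an event, assuming RabbitMQ via the Python pika client; the event fields and the shape of the storage information object below are illustrative assumptions (as noted above, the exact schema depends on the storage back-end):

```python
import json

import pika  # assumption: task dispatch events are published to RabbitMQ via pika

QUEUE = "md.workflow.task_dispatch"

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue=QUEUE, durable=True)

# Illustrative event body – field names are assumptions, not the agreed schema.
event = {
    "workflow_instance_id": "wf-instance-a",
    "task_id": "task-1",
    "storage_information": {  # shape depends on the storage back-end (e.g. MinIO)
        "endpoint": "minio:9000",
        "bucket": "payloads",
        "access_key": "<access-key>",
        "secret_key": "<secret-key>",
    },
}

channel.basic_publish(
    exchange="",
    routing_key=QUEUE,
    body=json.dumps(event),
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)
connection.close()
```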
Acceptance criteria
Database schema
For each workflow instance, a document such as the one below should be created:
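The original example document is not reproduced here; purely as an illustration of the kind of document meant (field names and values are assumptions, not the agreed schema), it might look like:

```python
# Hypothetical workflow instance document – illustrative only, not the agreed schema.
workflow_instance_document = {
    "id": "wf-instance-a",
    "workflow_id": "workflow-123",
    "payload_id": "payload-1234",
    "status": "Created",
    "start_time": "2022-01-01T00:00:00Z",
    "tasks": [
        {
            "task_id": "task-1",
            "status": "Dispatched",
            "output_directory": "payloads/payload-1234/workflows/wf-instance-a/task-1",
        }
    ],
}
```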