Scientific workflow engine designed for simplicity & scalability. Trivially transition between one-off use cases and massive-scale production environments.
The complicated retry/ack dance between WA and CA is meant to ensure the proper sequencing of status writes. A more straightforward way to accomplish this would be to use CRDTs, as described in Kuhn and Allen's Reactive Design Patterns. That way we don't have to care about the order of the writes: executions will always end up with the right status no matter which write attempt happens last.
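To make the idea concrete, here is a minimal sketch of a status CRDT, assuming statuses can be given a total order so that a merge simply keeps the more advanced one. The status names, their ranks, and the `merge` and `apply_writes` helpers are all hypothetical and not taken from Cromwell's code.

```python
from enum import IntEnum
from itertools import permutations


class Status(IntEnum):
    # Hypothetical statuses and ranks; a higher rank always wins a merge.
    NOT_STARTED = 0
    STARTING = 1
    RUNNING = 2
    DONE = 3


def merge(a: Status, b: Status) -> Status:
    """Join of two statuses: commutative, associative and idempotent."""
    return max(a, b)


def apply_writes(writes):
    """Fold a sequence of status writes into the stored status."""
    current = Status.NOT_STARTED
    for write in writes:
        current = merge(current, write)
    return current


# Every possible arrival order of the same writes converges on one status.
writes = [Status.STARTING, Status.RUNNING, Status.DONE]
results = {apply_writes(order) for order in permutations(writes)}
```

Because `merge` never moves a status backwards, a late or repeated write of an earlier status is harmless, which is the property the retry/ack sequencing exists to protect. A real version would also have to decide how competing terminal statuses rank against each other.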
A superior sequel to this could use the EXECUTION_INFO table in an event-sourcing-like, append-only manner to record all statuses and the times at which they were generated within Cromwell. A database view could present an image of all this that looks the same as the EXECUTION table does now, except we'd have conflict-free writes, simplified Cromwell internals, and event data in the DB that ops currently has to scour logs to find. But the first paragraph here is a prerequisite for all this and a good place to start.
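A rough sketch of the append-only table plus view, using an in-memory SQLite database for illustration. The `STATUS_EVENT` table, the `EXECUTION_CURRENT` view, their columns, the numeric `status_rank`, and the sample rows are all assumptions made for this example; they are not Cromwell's actual schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical append-only event table: rows are inserted, never updated.
CREATE TABLE STATUS_EVENT (
    execution_id INTEGER NOT NULL,
    status       TEXT    NOT NULL,
    status_rank  INTEGER NOT NULL,  -- higher rank = more advanced status
    generated_at TEXT    NOT NULL,  -- time the status was generated
    UNIQUE (execution_id, status_rank)
);

-- View exposing one current status per execution, whatever the write order.
CREATE VIEW EXECUTION_CURRENT AS
SELECT e.execution_id, e.status, e.generated_at
FROM STATUS_EVENT e
WHERE e.status_rank = (
    SELECT MAX(x.status_rank)
    FROM STATUS_EVENT x
    WHERE x.execution_id = e.execution_id
);
""")

# Illustrative events arriving out of order, including one duplicate write.
events = [
    (1, "Done", 3, "00:00:05"),
    (1, "Running", 2, "00:00:03"),
    (1, "Starting", 1, "00:00:01"),
    (1, "Running", 2, "00:00:03"),
    (2, "Starting", 1, "00:00:02"),
]
# INSERT OR IGNORE makes a repeated write a no-op instead of a conflict.
conn.executemany(
    "INSERT OR IGNORE INTO STATUS_EVENT VALUES (?, ?, ?, ?)", events
)

current = dict(
    conn.execute("SELECT execution_id, status FROM EXECUTION_CURRENT").fetchall()
)
history_count = conn.execute(
    "SELECT COUNT(*) FROM STATUS_EVENT WHERE execution_id = 1"
).fetchone()[0]
```

The view answers "what is the status now?" while the underlying rows keep the full history with timestamps, which is the event data ops would otherwise have to dig out of logs.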