broadinstitute / cromwell

Scientific workflow engine designed for simplicity & scalability. Trivially transition between one off use cases to massive scale production environments
http://cromwell.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
996 stars 361 forks source link

default_runtime_attributes should be a MaterializeWorkflowDescriptorActor thing #1076

Closed cjllanwarne closed 8 years ago

cjllanwarne commented 8 years ago

The assignment of these variables into each task's descriptor naturally fits in MaterializeWorkflowDescriptorActor rather than relying on the backends to re-implement it.

Re-enable test "assign default runtime attributes" in MaterializeWorkflowDescriptorActorSpec

kcibul commented 8 years ago

Good refactor to reduce future headaches/bugs/effort

mcovarr commented 8 years ago

I picked up this ticket mostly to avoid getting in other people's way, but having looked at if for a while I'm not seeing much value in this. The default_runtime_attributes feature itself is valuable and working, with Centaur test coverage. The backends aren't really re-implementing default runtime attributes, nearly all the heavy lifting is being done by RuntimeAttributesDefault. The backends currently do have to be aware of the existence of the default runtime attributes feature but that doesn't really seem so bad.

Unassigning and returning to the bottom of the 0.21 pile, recommending for demotion to a lesser pile or outright closure.

mcovarr commented 8 years ago

Currently default runtime attributes are typechecked in the backend-specific runtime attribute classes (e.g. JesRuntimeAttributes, LocalRuntimeAttributes, etc) using attribute name to type mappings that are backend-specific. With our current static backend selection scheme, MWDA knows the backend to which a task will be sent at validation time. So while it's currently possible to refactor to expose backend-specific default runtime attribute typechecking to MWDA, that system would break down with a dynamic backend selection scheme.

It's also not clear how MWDA-composed runtime attributes would be handed to the backend-specific runtime attribute classes for the more substantive "beyond typechecking" round of validation and execution. It's possible we could copy the NamespaceWithWorkflow and write the relevant attributes into the tasks, but I'm not sure if we'd get into trouble later with bindings that no longer agree with the AST.

mcovarr commented 8 years ago

I think this is entangled with #1166, or at least that ticket might be solved by implementing this one.

ruchim commented 8 years ago

@mcovarr is this ticket still applicable?

mcovarr commented 8 years ago

Yes it is, I didn't get to this at all. 😢

mcovarr commented 8 years ago

Per standup this morning we've decided to close this. The requested functionality is problematic on a few levels:

JSON runtime attributes are currently interpreted as WdlExpressions after going through coercion, which presently is a backend-dependent process as only the backend knows the accepted runtime attribute coercions. While currently it might appear safe to treat the default runtime attribute JSON values as WDL with a WdlExpression.fromString, that's an assumption of WDL / JSON expression equivalence not made elsewhere in Cromwell and it seems questionable to pioneer that here.

Currently Cromwell's backend assignments happen at initialization time, but in the future Cromwell's backend dispatch is likely to become more sophisticated and dynamic. It therefore wouldn't be possible to say at workflow initialization time the backend to which a call was fated, which would mean the default runtime attribute handling would need to happen in the various FooRuntimeAttributes classes as it does now. The refactor proposed here would actually remove the default runtime attribute handling at call execution time and therefore make dynamic dispatch more difficult to add in the future.

Finally, while it appears technically possible to doctor tasks with workflow option-derived default runtime attributes in MaterializeWorkflowDescriptorActor, this makes for some pretty hacky code. I discovered at least 3 spots that needed to be updated:

That last item was particularly hideous since Workflow#children is explicitly write-once. I got around this with subclassing, but I felt bad about myself afterward.