Jupyter-Kale / kale

Jupyter Interactive Workflows for High Performance Computing
Other
15 stars 3 forks source link

Kale parsl #49

Open OliverEvans96 opened 6 years ago

OliverEvans96 commented 6 years ago

Workflows can be executed via Parsl, and Parsl can be used to create workflows. This only works for PythonFunctionTasks so far.

Some changes have almost certainly broken Fireworks execution, so this PR is not ready to be merged, but I thought it would be good to open to compare with master.

See kale/examples/notebooks/Kale-Parsl.ipynb for explanation.

OliverEvans96 commented 6 years ago

I just rebased onto most recent changes to pre-parsl: b28da39 "Move nb_task to kale"

OliverEvans96 commented 6 years ago

I found an error in d7d46d6 Create separate function to add workflow dependencies, so I added a5e6baa Fix error in add_dependencies and rebased kale-parsl onto it.

OliverEvans96 commented 6 years ago

Tasks, for instance, are still bound to Fireworks. We need to detangle those implicit bindings.

What do you mean by that?

mlhenderson commented 6 years ago

Tasks are tied to Fireworks. Ideally we want a clean Task base class that does not rely on fireworks or parsl at all, and then a FireworksTask and a ParslTask. The behaviors should be separate. This is part of making this work extensible for new workflow systems that we did not explicitly code for.

I can cite some code lines where Task implies a Fireworks binding must exist.

https://github.com/Jupyter-Kale/kale/blob/kale-parsl/kale/workflow_objects.py#L628 https://github.com/Jupyter-Kale/kale/blob/kale-parsl/kale/workflow_objects.py#L639-L666

mlhenderson commented 6 years ago

I don't think you want these in the tree: https://github.com/Jupyter-Kale/kale/tree/kale-parsl/kale/examples/notebooks/fireworks/.ipynb_checkpoints

OliverEvans96 commented 6 years ago

I see what you mean - thanks for the explanation!

So presently we have a Task base class, then subclasses for BatchTask, CommandLineTask, PythonFunctionTask, NotebookTask. I imagine we don't want a Parsl/Fireworks version of each of those. Or maybe we do, but then it's necessary to define 4 different tasks for each new WF executor.

We could just have ParslTasks, FireworkTasks, and then task_type (which takes values of 'PythonFunctionTask', etc.) just be an attribute on the task object. But then we have lots of extra attributes that are only relevant to one type of task.

I think what might be more useful is just to pull all of the execution functions off of the Task objects and just have them in separate modules, designed to take the task as an argument. That way, we don't have kale.workflow_objects.Task._gen_firetask(), but rather kale.fireworks.gen_firetask(task), for example.

Do you have any thoughts on this aspect of the organization?

mlhenderson commented 6 years ago

So there is a fundamental change coming that is going to reorient some of these classes. We are going to have the workflow itself defined using a 'native' workflow description (e.g., Fireworks, Parsl, others), and then that system will be responsible for execution. We'll have our own hooks in there, but we want to leverage and defer to the underlying system where feasible.

Task maps most closely to Firework, since that is what the workflow in Fireworks is composed of. We could have something called Node instead, which would be the same concept. Its a unit of work either way. In Parsl, Task seems to be most similar to App, but Node might be more appropriate as a name. The issue in parsl though is that compared to Fireworks, you can have another control layer that is not explicitly part of the workflow. See, for example, https://parsl.readthedocs.io/en/latest/userguide/workflow.html#parallel-dataflows. Do we define an artificial node to encapsulate that?

Since each system will have some its own uniqueness, and I think so far at least Parsl seems to be quite different from Fireworks and other systems, we should be viewing each system as having its own entire set of components in Kale. We should have a core abstract set of components that will map generically to common elements of a workflow system, and then allow each system to overload and extend functionality to achieve a particular feature set. We want to keep this as simple as possible at first.

I don't think kale.fireworks.gen_firetask(task) will solve the problem, because you don't want to pass a parsl task into gen_firetask().

The simplest solution, for now, is probably to use either Task or Node as a generic base class, then subclass that for Fireworks and Parsl as a general Task/Node object. The other types (Batch, CommandLine, Python, Notebook) are more about a specific view. That could be implemented as a second class interface with multiple inheritance. The thing about using a lot of inheritance though is it can be brittle, meaning you want the base classes to be as stable as possible (minimize changes) to not disrupt the inheritance chain.

So it might look something like:

class FireworksNodeView(NodeView):
    ...

class FireworksNode(Node, FireworksNodeView):
    ...

class BatchNodeView(FireworksNodeView):
    ...

class FireworksBatchNode(FireworksNode, BatchNodeView):
    def __init__(self, ...):
        ...
    <FireworksNode overloaded methods>
    ...
    <BatchViewNode overloaded methods>
    ...
    <local methods>

What is the difference between Node and NodeView? NodeView has attributes and methods that correspond to the display of a Node. Node has attributes and methods that correspond to a workflow and its execution. You may not need to overload BatchNodeView for Fireworks and Parsl, maybe just inherit from BatchNodeView, but if you need to overload you can.