PMCC-BioinformaticsCore / janis-core

Core python modules for Janis Pipeline workflow assistant
GNU General Public License v3.0
4 stars 9 forks source link

Default workflow #31

Closed drtconway closed 4 years ago

drtconway commented 4 years ago

Hi Janis,

It would make it easier to package up workflows if there was a way, either explicitly, or by convention of allowing the specification of a default workflow/tool in a file containing multiple workflows/tools.

Some possibilities include:

Thinking about how it might work in practice, the following seems like a reasonable way to keep it modular, so that importing things down the track works neatly:


class MyHelperTool(PythonTool):
    ...

class Phase1(Workflow):
    ...

class Phase2(Workflow):
    ...

class MyExcellentWorkflow(Workflow):
    ...

class MainWorkflow(MyExcellentWorkflow, DefaultWorkflow):
    # May need to call init on the superclass, I guess
    pass

Thus a client that wants to use MyExcellentWorkflow as a component can import it, without inheriting the defaultness. Note that the usage pattern is not specific to the DefaultWorkflow way of specifying which thing is the default.

illusional commented 4 years ago

Hey @drtconway, from your previous similar issue, I don't think I've handled this process very well in the past. I've come up with the following new process in https://github.com/PMCC-BioinformaticsCore/janis-assistant/pull/30, which follows the procedure:

  1. If a name is defined:
    • Force parse every token with a case-insensitive match
    • If a single item is returned from a case-sensitive match, then use that
  2. If multiple workflows are defined in the same file, use the last defined workflow
    • This covers the existing If a single workflow is defined, use that case
  3. If no tools were found, raise an Exception
  4. If multiple tools are defined in the file, use the last one:
    • If a name was defined, warn the user that the case-insensitive match returned no results and we'll use the last one
    • Otherwise, just tell the user we'll use the last defined tool

What do you think?

drtconway commented 4 years ago

Using the last defined tool isn't too bad, but it does mean that rearranging code can change the semantics. I like the idea of explicitly marking a tool/workflow as the default one because it means that cosmetic changes to my code don't change the semantics. It's not something to die in a ditch over though.

Could a decorator come to the rescue?

@default
class MyExcellentWorkflow(Workflow):
    ...

On reflection, the explicit tagging could actually be done in a complementary and orthogonal way to the rules you give above, though if you had multiple workflows tagged as default, you should probably warn.

illusional commented 4 years ago

My main concern with annotations is that they're sticky through imports, so:

from registry import MyExcellentWorkflow # myexcellentworkflow is tagged with @default

@default
class MyOtherAwesomeWorkflow(Workflow):
    pass

Janis sees two default workflows, as it can't discern between tokens imported or defined. Unless you wanted to do something like this:

class MyAwesomeWorkflow(Workflow):
    ...

@default
class _MyAwesomeWorkflowDontImportThisOne(Workflow)
    ...

But I think that's less elegant.

You're right, it can work in complement to that in the PR, so I'll merge that for now while we keep discussion this.

drtconway commented 4 years ago

I agree the sticky problem makes it less elegant, but I do still like the idea of explicitly marking entrypoint workflows.

It's not unlike Python's __main__ thing.

illusional commented 4 years ago

What if you exported a specific keyword from your workflow, something like:

__JANIS_ENTRYPOINT = MyAwesomeWorkflow
drtconway commented 4 years ago

That seems like a reasonable approach.

illusional commented 4 years ago

Hi Tom, as discussed this was implemented and released in Janis-assistant, as in my previous comment, you can export __JANIS_ENTRYPOINT, and Janis will use that UNLESS the --name parameter is specified:

__JANIS_ENTRYPOINT = MyAwesomeWorkflow