VizierDB / vizier-scala

The Vizier kernel-free notebook programming environment

Export workflows as functions #294

Open okennedy opened 7 months ago

okennedy commented 7 months ago

What pain point is this feature intended to address? Please describe.

Workflows can get repetitive, with the same sequence of cells invoked multiple times. This makes workflows harder to follow and limits composability.

Describe the solution you'd like

Allow one cell to invoke a workflow:

Specifically:

Describe alternatives you've considered

  1. Repetition is the current approach. This is potentially problematic if a bug is found in the repeated code, as each copy of the cell needs to be rewritten. It is also cumbersome and makes notebooks longer.
  2. Another approach is to allow users to hide portions of the workflow. This addresses issues of encumbrance and size, but not repetition.

okennedy commented 7 months ago

A few open questions:

  1. Do we want to run the full workflow, or only relevant dependencies?
    • Potential approaches
      • Can we precompute the relevant dependencies?
      • Do we dynamically re-run the entire workflow?
      • Do we ask the user to tell us what to add to the list?
    • Thoughts
      • What seems logical is providing an interface through which the user can specify a subset of the workflow for export. See below for a thought.
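
One way to picture the "precompute the relevant dependencies" option is a backwards walk over the cells, keeping only those that contribute to the artifacts being exported. The sketch below is illustrative Python, not Vizier code: the `Cell` class, its `reads`/`writes` fields, and `relevant_cells` are hypothetical names, and it assumes each cell's inputs and outputs are known ahead of time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for a workflow cell; not Vizier's actual model."""
    name: str
    reads: frozenset   # artifact names the cell consumes
    writes: frozenset  # artifact names the cell produces


def relevant_cells(workflow, targets):
    """Return (cells needed to produce `targets` in workflow order, unresolved inputs)."""
    needed = set(targets)
    keep = []
    # Walk backwards so each artifact resolves to its most recent writer.
    for cell in reversed(workflow):
        produced = cell.writes & needed
        if produced:
            keep.append(cell)
            needed -= produced    # these artifacts are satisfied by this cell
            needed |= cell.reads  # but the cell's own inputs are now required
    keep.reverse()
    # Whatever is still needed is not produced by any cell in the workflow.
    return keep, needed


workflow = [
    Cell("load", frozenset(), frozenset({"raw"})),
    Cell("clean", frozenset({"raw"}), frozenset({"clean"})),
    Cell("plot", frozenset({"clean"}), frozenset({"chart"})),
    Cell("summary", frozenset({"clean"}), frozenset({"stats"})),
]

cells, unresolved = relevant_cells(workflow, {"stats"})
names = [c.name for c in cells]
print(names, unresolved)
```

Here the `plot` cell is skipped because nothing on the path to `stats` reads `chart`. Any artifact left in `unresolved` would have to be supplied from outside the exported subset. Cells whose dependencies are only known at run time would not fit this static approach, which is where the "dynamically re-run" and "ask the user" options come in.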

[Image: Publish workflow]

The idea

okennedy commented 7 months ago

Schema:

okennedy commented 7 months ago

Loosely, the idea is:

  1. Add an editor that lets you select a subset of the workflow to publish.
  2. This editor would allow you to replace 'parameter' and 'load dataset' cells with 'interface' cells that import/export artifacts between the invoking workflow and the invoked workflow.
  3. Once published, the workflow would be 'standalone'. It wouldn't need the original workflow itself, but would retain a link that would let the user easily bring the published workflow up to a newer version, if one exists.
  4. Invoking cells would identify the published workflow by a global id and a version. This would allow us to avoid invalidating past invocations if a new version is published.
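
The versioning idea in points 3 and 4 can be sketched as a registry keyed by (global id, version). This is a minimal illustration in Python under assumed names: `PublishedWorkflow`, `Registry`, and every field and method on them are hypothetical, not part of Vizier. The `inputs` and `outputs` fields stand in for the artifacts exchanged through 'interface' cells, and `source` stands in for the link back to the original workflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishedWorkflow:
    """Hypothetical immutable snapshot of a published workflow."""
    workflow_id: str  # global id shared by all versions
    version: int
    inputs: tuple     # artifacts imported from the invoking workflow
    outputs: tuple    # artifacts exported back to the invoking workflow
    source: str       # link back to the workflow it was published from


class Registry:
    def __init__(self):
        self._versions = {}  # (workflow_id, version) -> PublishedWorkflow

    def latest_version(self, workflow_id):
        """Highest published version for this id, or 0 if none exists."""
        return max((v for (wid, v) in self._versions if wid == workflow_id), default=0)

    def publish(self, workflow_id, inputs, outputs, source):
        """Store a new version; earlier versions are never overwritten."""
        version = self.latest_version(workflow_id) + 1
        published = PublishedWorkflow(workflow_id, version, tuple(inputs), tuple(outputs), source)
        self._versions[(workflow_id, version)] = published
        return published

    def resolve(self, workflow_id, version):
        """Look up the exact version an invoking cell was pinned to."""
        return self._versions[(workflow_id, version)]

    def upgrade_available(self, workflow_id, version):
        return self.latest_version(workflow_id) > version


registry = Registry()
v1 = registry.publish("clean-sales", ["raw"], ["clean"], "project-1/main")
pinned = (v1.workflow_id, v1.version)  # what an invoking cell would store
v2 = registry.publish("clean-sales", ["raw", "region"], ["clean"], "project-1/main")
print(registry.resolve(*pinned).version, registry.upgrade_available(*pinned))
```

Because the invoking cell stores both the id and the version, publishing `v2` leaves the result of `resolve(*pinned)` unchanged, so past invocations stay valid, while `upgrade_available` gives the user a way to discover the newer version.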

okennedy commented 6 months ago

image