EyeofBeholder-NLeSC / knime-demo

This repository keeps files for demonstrating the usage of KNIME.
Apache License 2.0

Notes of working with KNIME #2

Open jiqicn opened 2 years ago

jiqicn commented 2 years ago

This issue collects random ideas for designing the KNIME workflows in our project, so it might not be well organized (it can be summarized as a separate document in the future if needed).

jiqicn commented 2 years ago

Pipeline Design

General principles

Different ways of modularization in KNIME

Metanodes

Metanodes are the simplest way to clean up messy workflows, and that is the only thing they can do: metanodes are purely used to organize your workflows better. Also note that metanodes created by users can't be uploaded to KNIME Hub.

Components

Components are nodes that contain a sub-workflow, which lets you bundle functionality for sharing and reuse. Components are similar to metanodes but offer more functionality.

To refer:

1. A component can encapsulate flow variables

Flow variables created inside a component will not leave the component unless they are explicitly set as a component output. Vice versa, flow variables created in the workflow outside of the component will not enter the component unless they are explicitly set as a component input.
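The scoping rule can be pictured with a small Python model. This is only an analogy, not KNIME's API: `run_component`, `exposed_inputs`, and `exposed_outputs` are hypothetical names standing in for the variable selections on the component input and output.

```python
# Toy model of component flow-variable scoping (an analogy, not KNIME's API).

def run_component(outer_vars, exposed_inputs, exposed_outputs, body):
    """Run `body` so that it only sees the explicitly exposed variables."""
    # Only variables selected on the component input enter the component.
    inner_vars = {k: v for k, v in outer_vars.items() if k in exposed_inputs}
    body(inner_vars)  # nodes inside may create or change variables
    # Only variables selected on the component output leave the component.
    exported = {k: v for k, v in inner_vars.items() if k in exposed_outputs}
    return {**outer_vars, **exported}


def body(variables):
    variables["row_count"] = 42       # created inside and exposed as output
    variables["tmp_path"] = "/tmp/x"  # created inside, never exposed


outer = {"threshold": 0.5, "user": "alice"}
result = run_component(
    outer,
    exposed_inputs={"threshold"},
    exposed_outputs={"row_count"},
    body=body,
)
print(result)  # {'threshold': 0.5, 'user': 'alice', 'row_count': 42}
```

In this sketch `tmp_path` stays inside the component, and `user` is never visible to the component body because it was not selected as an input.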

2. Components are configured through a configuration window and configuration nodes

A component has its own configuration window for setting it up. This window can be customized to some extent by adding configuration nodes.

3. Only components can have a view (dashboard)

The configuration, widget, and visualization nodes inside a component can be organized and displayed within a single interactive view (dashboard). The layout is also customizable.

4. Components can be uploaded to KNIME Hub for sharing

In this way, components can be published for collaboration and reuse. Each published component gets a web page that shows a description of its functionality, its inputs and outputs, and links to sample workflows that demonstrate its usage.

Workflow Services

In KNIME, it's also possible to create a workflow as a service that can be called by other workflows. This is done with the KNIME Workflow Services set of nodes.

Unlike a component, whose nodes and configuration are all copied into the target workflow, a workflow service is merely referenced. The caller workflow knows what to call in the callee workflow, but the exact nodes that will be executed, and where the callee is located, are hidden from the caller.

Compared to components, workflow services have the following pros and cons:
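The copy-versus-reference distinction described here can be sketched in plain Python. This is a conceptual model only, not how KNIME is implemented: `SubWorkflow` is a hypothetical stand-in for a reusable piece of workflow, `copy.deepcopy` plays the role of embedding a component, and a plain reference plays the role of calling a workflow service.

```python
import copy


class SubWorkflow:
    """Hypothetical stand-in for a reusable chain of processing steps."""

    def __init__(self, steps):
        self.steps = list(steps)

    def run(self, value):
        for step in self.steps:
            value = step(value)
        return value


shared = SubWorkflow([lambda x: x + 1])

# "Component": the caller receives its own copy of the steps.
embedded = copy.deepcopy(shared)

# "Workflow service": the caller only keeps a reference to the callee.
service = shared

# The callee is changed later on.
shared.steps.append(lambda x: x * 10)

print(embedded.run(1))  # 2  -> the copy is unaffected by the later change
print(service.run(1))   # 20 -> the reference picks up the new step
```

Under this model, a caller of the service always runs whatever the callee currently contains, while an embedded copy keeps the state it had when it was copied.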

Advantages

Disadvantages

To summarize, workflow services are a good fit for splitting a big task into smaller subtasks that can be performed individually (e.g. collecting data, preparing data, training machine learning models, comparing performance, etc.), while components are more like highly modularized nodes (with their own configuration and view).

To refer:

jiqicn commented 2 years ago

KNIME Hub for sharing your work

It would be nice to have our work (workflows and components) uploaded to KNIME Hub. This is the recommended way of sharing work and collaborating with others.

  1. After creating a KNIME user account, a user can create a workspace on KNIME Hub, either private or public.
  2. Workflows and components can be uploaded to that space for sharing with others.
  3. User accounts can be mounted to the local KNIME software. After that, users can operate workspaces inside the KNIME software.
  4. A workflow/component can be uploaded by dragging and dropping that workflow/component to the desired workspace.
  5. It's always recommended to document workflows/components well by adding a description of their functionality, required inputs, and outputs.

To refer: https://docs.knime.com/latest/hub_user_guide/index.html#collaboration.

jiqicn commented 2 years ago

Best Practice

jiqicn commented 2 years ago

Component Design

It is best to build your shared component so that it behaves the way you would expect a KNIME node to behave. This means it should be configurable via the configuration window, have an informative description, and give meaningful error messages when something fails.
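The same three expectations (configurable, documented, meaningful errors) can be illustrated with an ordinary Python function. This is a sketch by analogy and not a KNIME component: `scale_column` and its parameters are hypothetical, with the function arguments standing in for the configuration window and the docstring for the component description.

```python
def scale_column(rows, column, method="minmax"):
    """Scale one numeric column of a table.

    rows:   list of dicts, one dict per table row
    column: name of the column to scale
    method: "minmax" (scale to [0, 1]) or "zscore" (zero mean, unit variance)
    """
    allowed = ("minmax", "zscore")
    # Validate the configuration up front and say what went wrong.
    if method not in allowed:
        raise ValueError(f"Unknown method '{method}'; choose one of {allowed}.")
    if not rows:
        raise ValueError("Input table is empty; nothing to scale.")
    if column not in rows[0]:
        raise KeyError(
            f"Column '{column}' not found; available columns: {sorted(rows[0])}."
        )

    values = [float(row[column]) for row in rows]
    if method == "minmax":
        low, high = min(values), max(values)
        span = (high - low) or 1.0  # avoid division by zero for constant columns
        scaled = [(v - low) / span for v in values]
    else:
        mean = sum(values) / len(values)
        std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5 or 1.0
        scaled = [(v - mean) / std for v in values]

    # Return new rows so the input table is left untouched.
    return [{**row, column: s} for row, s in zip(rows, scaled)]


table = [{"a": 0}, {"a": 5}, {"a": 10}]
print(scale_column(table, "a"))  # [{'a': 0.0}, {'a': 0.5}, {'a': 1.0}]
```

A misconfiguration such as `scale_column(table, "b")` fails immediately with a message that names the missing column and lists the available ones, which is the kind of feedback a user expects from a well-behaved node.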