SolidLabResearch / Challenges

A spec/ontology to describe the workflows used by the orchestrator #51

Open ajuvercr opened 2 years ago

ajuvercr commented 2 years ago

Create an ontology for components in a workflow.

Pitch

A workflow can only be as powerful as the combined power of its steps. To not .... the power of a workflow, it is paramount that steps can be implemented in any way, shape or form (different programming languages, one or many machines, etc.). To make this possible, an architecture (coined the connector architecture) was introduced, where each step corresponds to a processor, and processors communicate via channels.
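To make the processor/channel idea concrete, here is a minimal sketch in Python. It is an illustration only, not the actual implementation: the `Channel` class, the two processors and `run_workflow` are hypothetical names, and the in-memory queue stands in for whatever transport (Kafka, files, ...) a real channel would use.

```python
import queue
import threading


class Channel:
    """Hypothetical in-memory channel with a single reader (assumption for this sketch)."""

    _CLOSED = object()  # sentinel that marks the end of the stream

    def __init__(self):
        self._queue = queue.Queue()

    def send(self, item):
        self._queue.put(item)

    def close(self):
        self._queue.put(self._CLOSED)

    def __iter__(self):
        # Yield items until the writer closes the channel.
        while True:
            item = self._queue.get()
            if item is self._CLOSED:
                return
            yield item


def source_processor(out):
    """First step: emits a fixed list of items, then closes its output."""
    for word in ["solid", "pod"]:
        out.send(word)
    out.close()


def uppercase_processor(inp, out):
    """Second step: only knows its channels, not who is on the other side."""
    for item in inp:
        out.send(item.upper())
    out.close()


def run_workflow():
    """Wire two processors together and collect the final output."""
    first, second = Channel(), Channel()
    threads = [
        threading.Thread(target=source_processor, args=(first,)),
        threading.Thread(target=uppercase_processor, args=(first, second)),
    ]
    for thread in threads:
        thread.start()
    result = list(second)
    for thread in threads:
        thread.join()
    return result
```

Because each processor only touches the channels handed to it, either step could in principle be replaced by one written in another language or running on another machine, as long as the channel implementation is swapped accordingly.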

Existing frameworks for workflow management, such as NiFi, Oozie, Airflow, and Dagster, restrict users to the context of the framework, be it in terms of programming language, limited API extensibility or a fixed orchestration mechanism. On the other hand, DSL-based workflow management tools such as Toil and Snakemake are limited in the tasks they support, which include only Bash scripts.

Nextflow solves the aforementioned problems of these workflow management systems; however, it only supports file-based channels for data transfer. It cannot set up a workflow whose processors use arbitrary channels, such as Kafka, for data transfer.

Optionally: A processor can be viewed as a runner. A runner abstracts the connector architecture away from the end user but requires some configuration. An example runner can be a JsRunner that takes a path to a source file and a function name and boots up the required channels and passes them to the expected function.
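As a rough illustration of what such a runner does, here is a Python analogue of the JsRunner idea. This is a sketch under assumptions, not the real runner: the config keys (`source`, `function`, `channels`), the helper names and the use of `queue.Queue` as a channel are all hypothetical.

```python
import importlib.util
import os
import queue
import tempfile


def load_function(source_path, function_name):
    """Load a function by name from a Python source file."""
    spec = importlib.util.spec_from_file_location("processor_module", source_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, function_name)


def run_processor(config):
    """Boot the channels named in the config and pass them to the function.

    The config layout is an assumption made for this sketch.
    """
    channels = {name: queue.Queue() for name in config["channels"]}
    processor = load_function(config["source"], config["function"])
    processor(**channels)  # channels are passed as keyword arguments
    return channels


def demo():
    """Write a tiny processor to a temp file and run it through the runner."""
    source = 'def emit(output):\n    output.put("hello from processor")\n'
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "step.py")
        with open(path, "w") as handle:
            handle.write(source)
        channels = run_processor(
            {"source": path, "function": "emit", "channels": ["output"]}
        )
    return channels["output"].get_nowait()
```

The processor file itself contains no channel setup; it only receives ready-made channels, which is the part the runner abstracts away from the end user.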

Desired solution

Acceptance criteria

Pointers

Examples of configuration (not linked data) can be found at https://github.com/ajuvercr/nautirust-configs

Scenarios

Manage workflows to derive/provide data to Solid pods: https://github.com/SolidLabResearch/Challenges/issues/50

RubenVerborgh commented 2 years ago

Can we apply this to a concrete use case, and have a specific demo, such that the challenge has a clear end?

pheyvaer commented 2 years ago

@ajuvercr Did you already have a chance to look into making the requested changes?

ajuvercr commented 2 years ago

It is difficult to give a demo about an ontology. On the other hand, challenges like #50 require this challenge, which makes this challenge not really a SolidLabs Challenge?

pheyvaer commented 2 years ago

Yeah, fair point. @RubenVerborgh What do you think?

RubenVerborgh commented 2 years ago

I think there should be one or a couple of example workflows that can be described by (version 1 of) this ontology; this will give us an idea of what the ontology requires.

So basically, I am looking for any reasonable/representative workflow that, if the ontology can describe it, we can call the ontology a success.

pheyvaer commented 1 year ago

@ajuvercr Can you implement the necessary changes?

pheyvaer commented 1 year ago

@ajuvercr Do you think you can add what Ruben suggested?