truevoid-development / dagster-composable-graphs

Library to create Dagster jobs from YAML
https://docs.truevoid.dev
Apache License 2.0
dagster graphs orchestration python yaml

Dagster Composable Graphs

Dagster is a cloud-native data pipeline orchestrator for the whole development lifecycle, with integrated lineage and observability, a declarative programming model, and best-in-class testability.

This library adds the ability to define Dagster jobs in a YAML file. It may be used alongside any other package that integrates with Dagster.

Visit the documentation at https://docs.truevoid.dev.

Partially inspired by the post "Abstracting Pipelines for Analysts with a YAML DSL" on the Dagster blog.

Example

Consider the following definition of a ComposableGraph. Note in particular the sections inputs, operations and dependencies. Respectively, these define the graph inputs, the Dagster ops or graphs that are part of the job, and the dependencies between them.

apiVersion: truevoid.dev/v1alpha1
kind: ComposableGraph
metadata:
  name: concatenate-graphs
spec:
  inputs:
    x: 2.0
    y: 5.0
  operations:
    - name: add_and_multiply
      function: example.jobs.add_and_multiply
    - name: add
      function: example.jobs.add
    - name: multiply
      function: example.jobs.multiply
  dependencies:
    - name: add_and_multiply
      inputs:
        - x
        - y
    - name: add
      inputs:
        - node: add_and_multiply
          pointer: /add
        - x
    - name: multiply
      inputs:
        - y
        - node: add_and_multiply
          pointer: /multiply

The definition above results in the following job, visualized in the Dagster webserver UI:

[Image: the concatenate-graphs job in the Dagster webserver UI]
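
To make the wiring in the dependencies section concrete, here is a minimal plain-Python sketch of how the inputs of each node could be resolved. It is an illustration only, not this library's implementation and not Dagster code. The function bodies, the `SPEC` dictionary (a hand-written mirror of the YAML spec), and the helpers `resolve_pointer` and `run` are all hypothetical. It assumes that a bare string such as `x` refers to a graph input, that a `node`/`pointer` pair selects one output of an upstream node, that `add_and_multiply` returns a mapping with the keys `add` and `multiply`, and that dependencies are listed in an order in which they can be evaluated.

```python
# Hypothetical plain-Python stand-ins for example.jobs.* (not Dagster ops).
def add(a, b):
    return a + b


def multiply(a, b):
    return a * b


def add_and_multiply(a, b):
    # Assumed to expose two named outputs, selectable via /add and /multiply.
    return {"add": add(a, b), "multiply": multiply(a, b)}


# Hand-written mirror of the YAML spec above, with functions already resolved.
SPEC = {
    "inputs": {"x": 2.0, "y": 5.0},
    "operations": {
        "add_and_multiply": add_and_multiply,
        "add": add,
        "multiply": multiply,
    },
    "dependencies": [
        {"name": "add_and_multiply", "inputs": ["x", "y"]},
        {"name": "add",
         "inputs": [{"node": "add_and_multiply", "pointer": "/add"}, "x"]},
        {"name": "multiply",
         "inputs": ["y", {"node": "add_and_multiply", "pointer": "/multiply"}]},
    ],
}


def resolve_pointer(value, pointer):
    # Walk a slash-separated pointer such as "/add" through nested mappings.
    for key in pointer.strip("/").split("/"):
        if key:
            value = value[key]
    return value


def run(spec):
    results = {}
    # Assumes dependencies are listed so that upstream nodes come first.
    for dep in spec["dependencies"]:
        args = []
        for ref in dep["inputs"]:
            if isinstance(ref, str):
                # A bare name refers to a graph input.
                args.append(spec["inputs"][ref])
            else:
                # A node/pointer pair selects an output of an upstream node.
                args.append(resolve_pointer(results[ref["node"]], ref["pointer"]))
        results[dep["name"]] = spec["operations"][dep["name"]](*args)
    return results


results = run(SPEC)
print(results["add"], results["multiply"])
```

Under these assumptions, `add` receives the `/add` output of `add_and_multiply` together with `x`, and `multiply` receives `y` together with the `/multiply` output. In the real library the same structure is expressed through Dagster ops and graphs rather than direct function calls.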