sarit-si / docker-airflow-pdi-02

Setup Airflow & Pentaho (without Carte) in same Docker container

Description

A step-by-step approach to easily dockerize Airflow and Pentaho Data Integration (PDI) in the same container. Below is the high-level architecture of the image setup:

Pre-requisites

Versions

Environment variables, files & folders for container

Build & Deploy

The command below will build the image (on the first run) and start all the services.

    docker-compose up

To run the services in the background, add the -d flag.

Web UI

Airflow Webserver

    localhost:8080/home

How to trigger tasks from a DAG

Since there is no Carte server, tasks are executed by invoking PDI directly via kitchen.sh (for jobs) or pan.sh (for transformations). Tasks run in the same container as Airflow.

Job trigger:

    from airflow.operators.bash import BashOperator

    job = BashOperator(
            task_id='Trigger_Job',
            bash_command='/opt/airflow/data-integration/kitchen.sh -file:/opt/airflow/ktrs/helloworld/helloworld-job.kjb'
    )

Transformation trigger:

    trans = BashOperator(
            task_id='Trigger_Transformation',
            bash_command='/opt/airflow/data-integration/pan.sh -file:/opt/airflow/ktrs/helloworld/helloworld-trans.ktr'
    )
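The two operators above differ only in the PDI launcher (kitchen.sh for .kjb jobs, pan.sh for .ktr transformations) and the target file. As a minimal sketch, a small helper can build the bash_command string; the helper name `pdi_command` and the use of PDI's `-param:NAME=VALUE` option are illustrative assumptions, not part of this repo:

```python
# PDI install path used by this image setup
PDI_HOME = "/opt/airflow/data-integration"

def pdi_command(tool, file_path, params=None):
    """Build a kitchen.sh/pan.sh command line for a BashOperator.

    tool: "kitchen" for .kjb jobs, "pan" for .ktr transformations.
    params: optional dict of PDI named parameters, passed as -param:NAME=VALUE.
    """
    parts = ["{}/{}.sh".format(PDI_HOME, tool), "-file:" + file_path]
    for name, value in (params or {}).items():
        parts.append("-param:{}={}".format(name, value))
    return " ".join(parts)

# Reproduces the job trigger command shown above:
job_cmd = pdi_command("kitchen", "/opt/airflow/ktrs/helloworld/helloworld-job.kjb")
```

The resulting string can be passed straight to `BashOperator(bash_command=...)`, keeping the DAG file free of hard-coded paths.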

Best practices