Closed thomasbrockmeier-ams closed 2 years ago
Containerizes postprocessing code in separate Dockerfile
Refactors postprocessing code, reducing dependencies
Minor fixes/additions
Changes styling pipeline to not build the entire environment (runs over 20x faster now)
Example usage:
    postprocessing = KubernetesPodOperator(
        name="postprocessing",
        task_id="postprocessing",
        image="repository/myimages:0.1",
        image_pull_policy="Always",
        cmds=["python"],
        arguments=[
            "postprocessing.py",
            "--input_path", "/tmp/coco_instances_results-3.json",
            "--output_path", "/tmp",
            "--permits_file", "/tmp/decos.xml",
            "--bridges_file", "/tmp/bridges.geojson",
        ],
        namespace="airflow",
        get_logs=True,
        volumes=[volume],
        volume_mounts=[volume_mount],
        startup_timeout_seconds=3600,
    )
N.B. The Dockerfile uses SSH key forwarding to pull from a private GitHub repo. Build the image with Docker BuildKit enabled to use this functionality:
DOCKER_BUILDKIT=1 docker build --ssh default -f postprocessing.dockerfile -t myRegistry/myimages:0.1 .
When you pass the keyword arguments in the pod operator, did you check whether it also works with the flag and value as separate, comma-separated list items? Chris's example used equals signs, e.g. "--input_path=/tmp/coco_instances_results-3.json"
@epureanudiana, I tested the comma formatting on my machine. The DAG ran in Airflow with the code snippet above.
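For reference, a minimal sketch of why both formats should behave the same, assuming postprocessing.py parses its flags with Python's standard argparse (an assumption; the actual parser is not shown in this thread). The `build_parser` helper is hypothetical and only mirrors the four flags from the example usage:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the parser in postprocessing.py;
    # only the flag names are taken from the example usage.
    parser = argparse.ArgumentParser(prog="postprocessing.py")
    for flag in ("--input_path", "--output_path", "--permits_file", "--bridges_file"):
        parser.add_argument(flag, required=True)
    return parser


# Flag and value as separate list items (the "comma" format).
separate = build_parser().parse_args([
    "--input_path", "/tmp/coco_instances_results-3.json",
    "--output_path", "/tmp",
    "--permits_file", "/tmp/decos.xml",
    "--bridges_file", "/tmp/bridges.geojson",
])

# Flag and value joined with an equals sign.
joined = build_parser().parse_args([
    "--input_path=/tmp/coco_instances_results-3.json",
    "--output_path=/tmp",
    "--permits_file=/tmp/decos.xml",
    "--bridges_file=/tmp/bridges.geojson",
])

# argparse accepts both spellings for long options and yields the same namespace.
assert vars(separate) == vars(joined)
```

If the script uses a different argument parser, the two formats may not be interchangeable and are worth testing separately.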
The rest looks good!
The refactoring helps me as well. I still need to move register_dataset and ExperimentConfig from utils, but I will check after we merge this branch.