Computer-Vision-Team-Amsterdam / Detecting-Heavy-Objects

WIP: Bk 242/3 Postprocessing #15

Closed thomasbrockmeier-ams closed 2 years ago

thomasbrockmeier-ams commented 2 years ago

Containerizes postprocessing code in separate Dockerfile

Refactors postprocessing code, reducing dependencies

Minor fixes/additions

Changes styling pipeline to not build the entire environment (runs over 20x faster now)

Example usage:

    postprocessing = KubernetesPodOperator(
        name="postprocessing",
        task_id="postprocessing",
        image="repository/myimages:0.1",
        image_pull_policy="Always",
        cmds=["python"],
        arguments=[
            "postprocessing.py",
            "--input_path", "/tmp/coco_instances_results-3.json",
            "--output_path", "/tmp",
            "--permits_file", "/tmp/decos.xml",
            "--bridges_file", "/tmp/bridges.geojson",
        ],
        namespace="airflow",
        get_logs=True,
        # `volume` and `volume_mount` must be defined earlier in the DAG file.
        volumes=[volume],
        volume_mounts=[volume_mount],
        startup_timeout_seconds=3600,
    )

N.B. The Dockerfile uses SSH key forwarding to pull from a private GitHub repo. Build the image with Docker BuildKit enabled to use this functionality:

    DOCKER_BUILDKIT=1 docker build --ssh default -f postprocessing.dockerfile -t myRegistry/myimages:0.1 .

epureanudiana commented 2 years ago

When you pass the keyword arguments to the pod operator as separate, comma-separated list items, did you check that this works as well? Chris's example used equals signs, e.g. "--input_path=/tmp/coco_instances_results-3.json".

thomasbrockmeier-ams commented 2 years ago

@epureanudiana, I tested the comma formatting on my machine. The DAG ran in Airflow with the code snippet above.
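
For reference, a minimal sketch of why both argument styles should behave the same. It assumes postprocessing.py parses its flags with Python's argparse, which this PR does not show; `build_parser` is a hypothetical stand-in for the real parser, and only two of the four flags are exercised to keep it short.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the CLI of postprocessing.py (assumption:
    # the real script uses argparse with these long option names).
    parser = argparse.ArgumentParser(description="postprocessing CLI sketch")
    parser.add_argument("--input_path")
    parser.add_argument("--output_path")
    parser.add_argument("--permits_file")
    parser.add_argument("--bridges_file")
    return parser


# Style used in this PR: flag and value as separate list items.
separate = build_parser().parse_args([
    "--input_path", "/tmp/coco_instances_results-3.json",
    "--output_path", "/tmp",
])

# Style from the earlier example: flag and value joined by an equals sign.
joined = build_parser().parse_args([
    "--input_path=/tmp/coco_instances_results-3.json",
    "--output_path=/tmp",
])

# argparse accepts both forms for long options and yields the same namespace.
print(separate == joined)
```

If the script parses its arguments some other way, the two styles would need to be checked against that parser instead.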

epureanudiana commented 2 years ago

The rest looks good! The refactoring helps me as well. I still need to move register_dataset and ExperimentConfig out of utils, but I will check that after we merge this branch.