buda-base / ao-workflows

Use DAG platform to define and orchestrate workflows

Reconcile docker file system with dip-pump #18

Closed jimk-bdrc closed 5 months ago

jimk-bdrc commented 5 months ago

Airflow running under Docker writes to a bind-mounted volume, so the paths it records are paths that dip-pump cannot see.

The dip_log event is generated from a shell running in the Docker container, where the file path differs from the path on the native host. In the docker compose file, the volumes are mounted as:

      - ${ARCH_ROOT:-/mnt}/Archive0:/home/airflow/extern/Archive0
      - ${ARCH_ROOT:-/mnt}/Archive1:/home/airflow/extern/Archive1
      - ${ARCH_ROOT:-/mnt}/Archive2:/home/airflow/extern/Archive2
      - ${ARCH_ROOT:-/mnt}/Archive3:/home/airflow/extern/Archive3

Left of the colon is the host (local) path; right of the colon is the mount point inside the container.
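To make the mismatch concrete, here is a minimal sketch of how a path recorded inside the container relates to the host path. It assumes `ARCH_ROOT` is unset, so the default `/mnt` applies. `MOUNTS` and `container_to_host` are hypothetical names for illustration only and are not part of ao-workflows or dip-pump.

```python
from pathlib import PurePosixPath

# (host path, container path) pairs mirroring the compose volumes above,
# assuming ARCH_ROOT is unset so the default /mnt is used.
MOUNTS = [
    (f"/mnt/Archive{i}", f"/home/airflow/extern/Archive{i}") for i in range(4)
]


def container_to_host(path, mounts=MOUNTS):
    """Translate a path seen inside the container to the host path.

    Returns None when the path is not under any bind mount.
    """
    p = PurePosixPath(path)
    for host, container in mounts:
        try:
            # relative_to compares whole path components, so
            # .../Archive01 does not match the .../Archive0 mount.
            rel = p.relative_to(container)
        except ValueError:
            continue
        return str(PurePosixPath(host) / rel)
    return None


# Hypothetical sub-path, for illustration only.
print(container_to_host("/home/airflow/extern/Archive0/some/work"))
```

A process on the host sees only the translated form, which is why a record that stores the container-side path cannot be resolved there.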

This results in a dip_log dip_dest_path of /home/airflow/extern/Archive0 for the resulting record. This means that no dip_log work will be able to locate that path unless shims are made on sattva (ln -s /mnt/Archive0 /home/airflow/extern/Archive0), or the path is duplicated precisely in the docker image. The latter is the first path to explore:

      - ${ARCH_ROOT:-/mnt}/Archive0:/mnt/Archive0
      - ${ARCH_ROOT:-/mnt}/Archive1:/mnt/Archive1
      - ${ARCH_ROOT:-/mnt}/Archive2:/mnt/Archive2
      - ${ARCH_ROOT:-/mnt}/Archive3:/mnt/Archive3

bdrc-docker.sh can do this, in the same way it creates other resources. Only if that fails, use a shim on the client hosts (which I really don't want to have to support on two machines!).

The fix is simpler; mount the whole archive root once:

      - ${ARCH_ROOT:-/mnt}:/mnt

Then reflect the change in the DAG, Dockerfile-bdrc, and bdrc-docker-compose.yml.
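One consequence of the single mount is worth checking: the container always sees `/mnt`, so container and host paths coincide only when the host side of the mount is also `/mnt`. The sketch below is an illustration under that assumption, not code from the repository. `host_root` and `paths_agree` are hypothetical helpers, and `host_root` emulates Compose's `${ARCH_ROOT:-/mnt}` interpolation, which falls back to the default when the variable is unset or empty.

```python
import os

CONTAINER_ROOT = "/mnt"  # right-hand side of the single bind mount


def host_root(env=None):
    """Emulate ${ARCH_ROOT:-/mnt}: use the default when unset or empty."""
    env = os.environ if env is None else env
    return env.get("ARCH_ROOT") or "/mnt"


def paths_agree(env=None):
    """True when a path written in the container is also valid on the host."""
    return host_root(env).rstrip("/") == CONTAINER_ROOT


if __name__ == "__main__":
    if not paths_agree():
        print(f"warning: host root {host_root()} differs from {CONTAINER_ROOT}")
```

If `ARCH_ROOT` is overridden on a host, the recorded dip_dest_path would again differ from the host path, so a check of this kind could run before the DAG starts.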

jimk-bdrc commented 5 months ago

Fixed in 76a59d