azavea / noaa-hydro-data

NOAA Phase 2 Hydrological Data Processing

Workflows for Flood Inundation Mapping #124

Open jpolchlo opened 1 year ago

jpolchlo commented 1 year ago

Overview

Following on from #101, we need to be able to run the flood inundation mapping code from NOAA OWP. This PR takes a swing at that objective. I'm providing some Argo Workflow examples that use an EFS volume to mount the required data directly to the filesystem. This works provisionally: the mounting mechanism works, even though the code doesn't run all the way to completion at the time of filing this PR.

I've tested the FR dataset. The GMS dataset requires different steps and may need a more complex workflow to run it.
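
For readers unfamiliar with the mechanism, here is a minimal sketch of what an Argo Workflow with a mounted data volume could look like, built as a Python dict and printed as JSON. It is not the manifest from this PR. It assumes the EFS filesystem is exposed to the cluster as a PersistentVolumeClaim, and the claim name `efs-data-claim`, mount path `/data`, template name `run-fim`, and image `example/fim:latest` are all hypothetical placeholders.

```python
import json


def efs_workflow_manifest(image, command, claim_name="efs-data-claim", mount_path="/data"):
    """Build an Argo Workflow manifest that mounts an existing PVC into one container.

    claim_name and mount_path are hypothetical defaults, not values from this PR.
    """
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "fim-"},
        "spec": {
            "entrypoint": "run-fim",
            # Workflow-level volume backed by a pre-existing claim
            # (assumed here to be provisioned on top of EFS).
            "volumes": [
                {"name": "efs-data", "persistentVolumeClaim": {"claimName": claim_name}}
            ],
            "templates": [
                {
                    "name": "run-fim",
                    "container": {
                        "image": image,
                        "command": list(command),
                        # The mount name must match the volume name above.
                        "volumeMounts": [{"name": "efs-data", "mountPath": mount_path}],
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = efs_workflow_manifest("example/fim:latest", ["ls", "/data"])
    print(json.dumps(manifest, indent=2))
```

The key point is that the volume is declared once at the workflow level and then referenced by name in the container's `volumeMounts`, so the data appears as an ordinary directory inside the container.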

jpolchlo commented 1 year ago

I have added an inundation workflow that runs to completion. It uses a combination of EFS data and EBS scratch space to perform the work, and the result is synced to S3 after completion. The most recent iteration also provides a Dockerfile, a modification of this one, which adds the s3fs-fuse utility. This FUSE plugin does not work here because it doesn't understand IRSA and requires access keys and secrets. There is some recent effort to fix this, but it's not ready, nor is it necessary: it's sufficient to chain a few commands together to do the required transfer to S3 after the process completes.
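
To illustrate the "chain a few commands" approach, here is a rough sketch in Python rather than the actual commands used in the workflow. It assumes the container image includes the AWS CLI and that the CLI can pick up credentials from the pod's environment. The function name `run_then_sync` and the paths and bucket in the usage note are hypothetical.

```python
import subprocess


def run_then_sync(process_cmd, scratch_dir, s3_uri, runner=subprocess.run):
    """Run the processing command, then upload scratch_dir to S3 if it succeeded.

    runner is injectable so the chaining logic can be exercised without
    actually invoking the AWS CLI.
    """
    # check=True raises CalledProcessError on a nonzero exit status, so the
    # sync step below is skipped when processing fails.
    runner(list(process_cmd), check=True)

    # `aws s3 sync <local-dir> <s3-uri>` copies new and changed files upward.
    sync_cmd = ["aws", "s3", "sync", scratch_dir, s3_uri]
    runner(sync_cmd, check=True)
    return sync_cmd
```

A call such as `run_then_sync(["./run_job.sh"], "/scratch/outputs", "s3://example-bucket/outputs")` would run the job against the scratch volume and only then push the results, which is the same effect as joining the two commands with `&&` in a container's shell command.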

The potential benefit of the FUSE plugin is that it would obviate the need for an EFS volume, which costs extra on top of the S3 storage and can fall out of sync with it. I'm contributing the s3fs-enabled Docker image as a historical artifact that could become useful, possibly soon. The downside of taking this route, however, is that containers using FUSE need to run in privileged mode, which may open vectors for misbehavior. A topic for later debate, perhaps.

jpolchlo commented 1 year ago

Opening this up for review. I do need to document how to use this in a README, but the content is as good as it's going to get.