NOAA-OWP / DMOD

Distributed Model on Demand infrastructure for OWP's Model as a Service

Normalize path values in configuration data within context of job workers #407

Open robertbartel opened 1 year ago

robertbartel commented 1 year ago

Some configuration data applicable to various DMOD jobs involves paths. Examples include paths to other configuration files, BMI module shared libraries, data files/directories, etc.

Paths in configuration data need to be valid within the context of the job worker's file system. Because users are generally not expected to be aware of this file system structure, functionality must be added to DMOD so that it can normalize these path values.
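A minimal sketch of what such normalization could look like, assuming path values can be identified by a known set of config keys and that user-side directories map onto worker-side mount points. The function names, the key names, and the `/dmod/...` worker paths are hypothetical and only for illustration; none of them are taken from the DMOD source.

```python
from pathlib import PurePosixPath
from typing import Any, Collection, Mapping


def normalize_path(value: str, root_map: Mapping[str, str]) -> str:
    """Rewrite ``value`` so it is rooted at the worker-side directory.

    ``root_map`` maps user-side directories to (hypothetical) worker-side
    mount points. Values that fall under no known root are returned as-is.
    """
    path = PurePosixPath(value)
    for user_root, worker_root in root_map.items():
        try:
            relative = path.relative_to(user_root)
        except ValueError:
            # Not located under this user-side root; try the next one.
            continue
        return str(PurePosixPath(worker_root) / relative)
    return value


def normalize_config(config: Any, root_map: Mapping[str, str],
                     path_keys: Collection[str]) -> Any:
    """Return a copy of nested config data with path values normalized.

    Only string values stored under a key listed in ``path_keys`` are
    treated as paths; everything else is copied through unchanged.
    """
    if isinstance(config, dict):
        return {
            key: normalize_path(val, root_map)
            if key in path_keys and isinstance(val, str)
            else normalize_config(val, root_map, path_keys)
            for key, val in config.items()
        }
    if isinstance(config, list):
        return [normalize_config(item, root_map, path_keys) for item in config]
    return config


# Illustrative data only: the keys and directories are assumptions.
user_config = {
    "name": "bmi_c",
    "library_file": "/home/user/models/libcfe.so",
    "modules": [{"init_config": "/home/user/configs/cat-1.ini"}],
}
root_map = {
    "/home/user/models": "/dmod/shared_libs",
    "/home/user/configs": "/dmod/datasets/config",
}
worker_config = normalize_config(user_config, root_map,
                                 path_keys={"library_file", "init_config"})
```

Keeping the root mapping as plain data means the same function could run wherever the normalization ends up living, since only the mapping depends on the worker's file system layout.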

The normalization must be done at some point after the data is received from the user (via form, direct file upload, etc.) and before a job worker uses it, but the best place for this still needs to be determined. Two initial possibilities are:

There are also implications to consider from how the ngen-config dependency validates paths, in particular when the applicable types are used within the data service, which won't always have the backing config data loaded.

aaraney commented 7 months ago

This is blocked until the following are merged:

aaraney commented 5 months ago

This is also now blocking this work:

robertbartel commented 5 months ago

@aaraney, can you clarify the blocking relationship with https://github.com/NOAA-OWP/ngen-cal/issues/119; i.e.:

aaraney commented 5 months ago

This issue can be completed without https://github.com/NOAA-OWP/ngen-cal/issues/119. However, if https://github.com/NOAA-OWP/ngen-cal/issues/119 is introduced, changes to the DMOD source will be required, including to the feature discussed in this issue.

robertbartel commented 3 months ago

@aaraney, in thinking about #637, #654, #593, and similar issues, I'm considering whether we need to adapt/extend dataset formats and data orchestration behavior in certain ways that will be significant to this issue. We might need to discuss some of this further.

For one, we may need to store certain data that comes as a large number of individual files in archives, for performance reasons. Also, we may need a way for containerized workers to truly download data to local disk (i.e., on the actual host node), perhaps to a simple Docker volume or something similar. Both of these likely have implications for how we set up paths in configs to find the data within the workers.

Archiving files before uploading to an object store (e.g., generated BMI init configs) is orders of magnitude faster than uploading them individually, but means workers would need to extract things. Retrieving all data files at once, even without archiving, also appears to be orders of magnitude faster than retrieving files one at a time, at least for forcing CSV files.
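To make the archive idea concrete, here is a rough sketch using only the Python standard library: one function packs a directory of many small files into a single archive before upload, and another unpacks it on the worker side. The function names are hypothetical, and the object store upload/download step is deliberately left out.

```python
import tarfile
from pathlib import Path


def archive_directory(src_dir: Path, archive_path: Path) -> Path:
    """Pack every file under ``src_dir`` into one gzip-compressed tar file.

    Member names are stored relative to ``src_dir`` so the archive can be
    unpacked under any directory. ``archive_path`` should be outside of
    ``src_dir``.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for file_path in sorted(src_dir.rglob("*")):
            if file_path.is_file():
                tar.add(file_path,
                        arcname=file_path.relative_to(src_dir).as_posix())
    return archive_path


def extract_archive(archive_path: Path, dest_dir: Path) -> Path:
    """Unpack an archive created by ``archive_directory`` into ``dest_dir``.

    Assumes the archive is trusted, i.e., produced by the function above.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)
    return dest_dir
```

With this approach, any path values in the configs would have to point at the worker's extraction directory rather than at per-file locations in the object store, which is where it ties back to the normalization discussed in this issue.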