Open agstephens opened 2 years ago
Here is a brief description of how we can build the first JASMIN EOEPCA Application.
We want to use daops, which is currently a Python library that includes subsetting capabilities.
We need to build a command-line tool for daops, maybe looking a bit like this (using click):
https://github.com/cedadev/kerchunk-tools/blob/main/kerchunk_tools/cli.py
Put the cli.py file here: https://github.com/roocs/daops/tree/master/daops/
And edit the setup.py file to tell Python how to install it as a command-line entry point, like:
https://github.com/cedadev/kerchunk-tools/blob/main/setup.py#L66-L70
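As a rough sketch of what that entry-point declaration could look like for daops: the callable name `main` is an assumption (use whatever name the click group in cli.py ends up with), and the rest of the setup.py arguments are left out.

```python
# Sketch of the console-script declaration for setup.py.
# ASSUMPTION: daops/cli.py exposes a click group (or function) named `main`.
ENTRY_POINTS = {
    "console_scripts": [
        # Format: "<command name>=<module path>:<callable>"
        "daops=daops.cli:main",
    ],
}

# In setup.py this dictionary would be handed to setuptools alongside the
# existing arguments, for example:
#
#     from setuptools import setup
#     setup(name="daops", entry_points=ENTRY_POINTS)
```

After a `pip install -e .`, a `daops` executable should then be available on the PATH.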
Potentially, we want to be able to do:
$ daops subset [--area | -a <w>,<s>,<e>,<n>] [--time | -t <time_window>] [--time-components | -c <time_components>] [--level | -l <level_range>] [--output-format | -f <format>] [--output-dir | -d <output_directory>] <collection>
NOTE: output format is "netcdf", "nc", or "zarr".
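If the CLI validates these values before handing them on, the checks could be as small as the sketch below. The helper names `parse_area` and `normalise_format` are hypothetical, and treating "nc" as an alias for "netcdf" is an assumption; the library may well accept the raw strings unchanged.

```python
from typing import Optional, Tuple

# Accepted spellings of the output format, mapped to one canonical name.
# ASSUMPTION: "nc" is simply an alias for "netcdf".
OUTPUT_FORMATS = {"netcdf": "netcdf", "nc": "netcdf", "zarr": "zarr"}


def parse_area(value: Optional[str]) -> Optional[Tuple[float, float, float, float]]:
    """Turn "<w>,<s>,<e>,<n>" into a tuple of four floats (None passes through)."""
    if value is None:
        return None
    parts = [p.strip() for p in value.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected <w>,<s>,<e>,<n>, got: {value!r}")
    west, south, east, north = (float(p) for p in parts)
    if south > north:
        raise ValueError("southern bound must not exceed northern bound")
    return (west, south, east, north)


def normalise_format(value: str) -> str:
    """Map a user-supplied format name onto its canonical name."""
    try:
        return OUTPUT_FORMATS[value.lower()]
    except KeyError:
        raise ValueError(f"unsupported output format: {value!r}") from None
```

For example, `parse_area("0,-10,120,40")` gives a four-float tuple, and `normalise_format("nc")` gives "netcdf".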
Note that the rook WPS gives us examples of input strings that we could utilise:
https://github.com/roocs/rook/blob/master/rook/processes/wps_subset.py#L30-L68
The code to wrap is the subset function, here:
https://github.com/roocs/daops/blob/master/daops/ops/subset.py#L32-L43
You can ignore these arguments: split_method, file_namer, apply_fixes. But set apply_fixes=False in the call.
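Putting the pieces together, a first cut of cli.py might look roughly like this. It is a sketch only: the group name `main`, the keyword names other than `apply_fixes` (`time`, `area`, `level`, `time_components`, `output_dir`, `output_type`), and the mapping of "nc" to "netcdf" are all assumptions that should be checked against the subset function linked above.

```python
import click


@click.group()
def main():
    """Command-line interface to daops (sketch)."""


@main.command()
@click.option("--area", "-a", default=None, help="<w>,<s>,<e>,<n>")
@click.option("--time", "-t", default=None, help="Time window")
@click.option("--time-components", "-c", default=None, help="Time components")
@click.option("--level", "-l", default=None, help="Level range")
@click.option("--output-format", "-f", default="netcdf",
              type=click.Choice(["netcdf", "nc", "zarr"]), help="Output format")
@click.option("--output-dir", "-d", default=None, help="Output directory")
@click.argument("collection")
def subset(area, time, time_components, level, output_format, output_dir, collection):
    """Subset COLLECTION and write the result to the output directory."""
    # Imported lazily so that `daops --help` works without loading the library.
    from daops.ops.subset import subset as daops_subset

    # ASSUMPTION: "nc" is shorthand for "netcdf".
    output_type = "netcdf" if output_format == "nc" else output_format

    # ASSUMPTION: keyword names mirror the linked function signature.
    # split_method and file_namer are left at their defaults.
    result = daops_subset(
        collection,
        time=time,
        area=area,
        level=level,
        time_components=time_components,
        output_dir=output_dir,
        output_type=output_type,
        apply_fixes=False,
    )
    click.echo(result)
```

Option values are passed through as plain strings here, on the assumption that the library accepts the same string forms as the rook WPS inputs.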
Also, add unit tests for cli.py, maybe a bit like:
https://github.com/cedadev/kerchunk-tools/blob/main/tests/test_cli.py
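A minimal sketch of such tests, using click's own test runner. To keep the example runnable on its own it defines a stand-in command, `fake_subset` (hypothetical); the real tests would import the actual command from the daops CLI module instead.

```python
import click
from click.testing import CliRunner


# Stand-in for the real command so that this sketch runs on its own.
# In the real test module, import the CLI from daops instead.
@click.command()
@click.option("--output-dir", "-d", default=None)
@click.argument("collection")
def fake_subset(output_dir, collection):
    click.echo(f"{collection} -> {output_dir}")


def test_subset_reports_collection_and_output_dir():
    # CliRunner invokes the command in-process and captures its output.
    result = CliRunner().invoke(fake_subset, ["-d", "/tmp/out", "my.collection.id"])
    assert result.exit_code == 0
    assert result.output == "my.collection.id -> /tmp/out\n"


def test_subset_requires_collection():
    # A missing positional argument is a usage error (exit code 2 in click).
    result = CliRunner().invoke(fake_subset, ["-d", "/tmp/out"])
    assert result.exit_code == 2
```

Run with pytest, these exercise argument parsing without touching any data; tests of the real command would also need a small sample dataset.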
Once the daops application is fully working, create this file:
https://github.com/roocs/daops/blob/master/Dockerfile
An example of an existing application for EOEPCA Dockerfile is available here (for reference):
https://github.com/EOEPCA/app-snuggs/blob/main/Dockerfile
We have a cedadev account, so we can publish there:
https://hub.docker.com/u/cedadev
An example EOEPCA application on Dockerhub is:
Create this file in Common Workflow Language (CWL) format:
https://github.com/roocs/daops/blob/master/app-package.cwl
Based upon this as a template:
https://github.com/EOEPCA/app-snuggs/blob/main/app-package.cwl
Follow the instructions at:
https://deployment-guide.docs.eoepca.org/current/eoepca/ades/#deploy-process
Details from proposal
We will create an example Application Package (as a Docker container) that discovers and stages in data from the CCI Data Service. This data will be processed and the results staged out to the User Workspace on the JASMIN object-store.
The proposed application package will be a containerized Python tool that provides temporal and spatial subsetting of ESA CCI data, which is available through the existing ESA CCI Data Portal (catalogue). It will be developed on top of an existing processing framework built to support climate simulations delivered through the Copernicus Climate Change Service (C3S) [https://roocs.github.io/overview/]. The main extension for this project will be support for additional datasets, although the core functionality can work with any regular gridded CF-NetCDF data. The application package will be deployed through the ADES to make it available through the EOEPCA services.