unifhy-org / unifhy

A Unified Framework for Hydrology
https://unifhy-org.github.io/unifhy
BSD 3-Clause "New" or "Revised" License

Implement functionality to record outputs in CF-NC file #13

Closed ThibHlln closed 3 years ago

ThibHlln commented 3 years ago

resolve #4

The API of Component now features an optional argument outputs that lets the user specify which component variables should be written to a CF-compliant netCDF file. These variables can be component states, component transfers, and/or component-specific outputs (i.e. those not already part of the framework's interface).

For each variable, the user can request as many output frequencies as desired (each specified as a datetime.timedelta, which must be an integer multiple of the component's temporal resolution) and as many aggregation methods as desired (provided as a sequence of strings; the options are mean/average, sum/cumulative, point/instantaneous, minimum/min, maximum/max). The example below shows the data structure expected for the outputs argument of Component.__init__().

from datetime import timedelta

outputs = {
    'output_a': {
        timedelta(days=1): ['mean', 'sum'],
        timedelta(hours=1): ['min']
    },
    'state_a': {
        timedelta(hours=12): ['point']
    },
    'transfer_a': {
        timedelta(weeks=1): ['mean']
    }
}
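To make the constraints on this data structure concrete, here is a minimal sketch of how such a dictionary could be checked before a run. It is not the code added in this PR: the function name check_outputs, the alias table, and the choice of canonical method names are all assumptions made for illustration.

```python
from datetime import timedelta

# Hypothetical alias table (not taken from unifhy): maps each accepted
# method string to one canonical name.
_METHOD_ALIASES = {
    'mean': 'mean', 'average': 'mean',
    'sum': 'sum', 'cumulative': 'sum',
    'point': 'point', 'instantaneous': 'point',
    'min': 'min', 'minimum': 'min',
    'max': 'max', 'maximum': 'max',
}


def check_outputs(outputs, timestep):
    """Return a copy of `outputs` with canonical method names.

    Raises ValueError if a frequency is not a positive integer multiple
    of `timestep`, or if an aggregation method is unknown.
    """
    checked = {}
    for name, frequencies in outputs.items():
        checked[name] = {}
        for frequency, methods in frequencies.items():
            # timedelta supports the modulo operator, so a zero remainder
            # means the frequency is an integer multiple of the timestep
            if (frequency <= timedelta(0)
                    or frequency % timestep != timedelta(0)):
                raise ValueError(
                    f"frequency {frequency} for '{name}' is not an "
                    f"integer multiple of timestep {timestep}"
                )
            canonical = []
            for method in methods:
                if method not in _METHOD_ALIASES:
                    raise ValueError(
                        f"unknown method '{method}' for '{name}'"
                    )
                canonical.append(_METHOD_ALIASES[method])
            checked[name][frequency] = canonical
    return checked
```

With an hourly component, a daily frequency passes this check, whereas timedelta(minutes=90) would be rejected because it leaves a 30-minute remainder.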

Frequencies that are not integer multiples of the component's temporal resolution are not supported (i.e. there is no time interpolation). Moreover, it is not possible to output only a sub-period of the component's TimeDomain, nor to start the output on a specific time step: for a given output, the first value written to file is for (simulation start)+(output frequency), and values continue to be written at that frequency until (simulation end) is reached. Depending on the pair {(simulation period),(output frequency)}, there may or may not be an output for the last simulation time step.
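The timing rule above can be illustrated with a short sketch. The helper output_times is hypothetical (it is not part of this PR), and it assumes that (simulation end) is a timestamp at which an output may still be written.

```python
from datetime import datetime, timedelta


def output_times(start, end, frequency):
    """List the timestamps at which an output would be written.

    The first output is at start + frequency; later outputs follow at
    the same frequency for as long as they do not go past `end`.
    """
    times = []
    current = start + frequency
    while current <= end:
        times.append(current)
        current += frequency
    return times


# A 9-day simulation with a weekly output: a single value is written on
# day 7, and the last two days are not covered by any output.
weekly = output_times(datetime(2019, 1, 1), datetime(2019, 1, 10),
                      timedelta(weeks=1))
```

In this example the simulation period is not an integer multiple of the output frequency, so no output coincides with the last simulation time step.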

For each component, the number of output files produced depends on the number of different frequencies requested (referred to as 'output streams' in the code): for a given component, all outputs sharing the same output frequency are stored in the same file. The number of output variables in a given stream file depends on the number of outputs and the number of aggregation methods requested for these outputs. The variable names are formed as (output name)_(output method), which distinguishes between variables when more than one output method is requested for a given output at a given frequency. Note that even if only one method is requested, the variable name formatting remains unchanged (i.e. it features the output method regardless).
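As a rough illustration of this grouping, the sketch below derives the streams and their variable names from an outputs dictionary. The function group_into_streams is hypothetical, and it assumes that the method string is used in the variable name exactly as the user supplied it.

```python
from collections import defaultdict
from datetime import timedelta


def group_into_streams(outputs):
    """Map each output frequency (one stream, hence one file) to the
    list of variable names stored in it."""
    streams = defaultdict(list)
    for name, frequencies in outputs.items():
        for frequency, methods in frequencies.items():
            for method in methods:
                # naming convention: (output name)_(output method)
                streams[frequency].append(f'{name}_{method}')
    return dict(streams)


example_streams = group_into_streams({
    'output_a': {
        timedelta(days=1): ['mean', 'sum'],
        timedelta(hours=1): ['min']
    },
    'state_a': {
        timedelta(days=1): ['point']
    }
})
```

Here two frequencies are requested, so two files would be produced: the daily stream holds output_a_mean, output_a_sum and state_a_point, and the hourly stream holds output_a_min.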