Closed: kmdeck closed this issue 7 months ago
> I think the `FileReader` will ultimately only contain the path to the raw data, and then it will read it in and regrid it when asked to by `evaluate!`. Does this mean that in the temporal case regridding happens on the fly? The alternate is what we have now: regridding up front and storing the results in separate files. If we do this, the `FileReader` will contain the paths to the raw and regridded data. I think this is what we have implemented currently.

`FileReader` will probably be more complex than just a path to the raw data. It will probably store some regridded data in memory, as well as a `Regridder` object that knows how to do the regridding. If possible, we should not go through additional I/O to do regridding, especially on GPU.
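One way to read the reply above is sketched here: a reader that owns a regridder, regrids in memory, and keeps the result so no intermediate file is written. This is only a sketch under stated assumptions. `LinearRegridder`, `regrid`, `read_raw`, `regridded_data`, and the fields of `FileReader` are hypothetical names chosen for illustration, not the ClimaUtilities API; a one-number-per-line text file stands in for a NetCDF file, and 1D linear interpolation stands in for real regridding.

```julia
# Hypothetical sketch; none of these names are the ClimaUtilities API.

# Maps data from the raw (source) grid to the model (target) grid.
# 1D linear interpolation stands in for real horizontal regridding.
struct LinearRegridder
    source_points::Vector{Float64}  # sorted coordinates of the raw data
    target_points::Vector{Float64}  # model grid, assumed within the source range
end

function regrid(r::LinearRegridder, data::Vector{Float64})
    out = similar(r.target_points)
    for (i, x) in enumerate(r.target_points)
        # Left neighbor of x, clamped so that j and j + 1 are both valid indices.
        j = clamp(searchsortedlast(r.source_points, x), 1, length(r.source_points) - 1)
        x0, x1 = r.source_points[j], r.source_points[j + 1]
        w = (x - x0) / (x1 - x0)
        out[i] = (1 - w) * data[j] + w * data[j + 1]
    end
    return out
end

# Holds the path to the raw data, the regridder, and an in-memory cache.
struct FileReader{R}
    path::String
    regridder::R
    cache::Dict{String, Vector{Float64}}
end
FileReader(path, regridder) = FileReader(path, regridder, Dict{String, Vector{Float64}}())

# Stand-in for reading one variable from a data file: one number per line.
read_raw(path::AbstractString) = parse.(Float64, readlines(path))

# Read and regrid on first request; later requests reuse the cached array.
function regridded_data(reader::FileReader)
    return get!(reader.cache, reader.path) do
        regrid(reader.regridder, read_raw(reader.path))
    end
end

# Usage: write a tiny "raw data" file, then regrid it without further I/O.
path, io = mktemp()
foreach(v -> println(io, v), (0.0, 10.0, 20.0))
close(io)

reader = FileReader(path, LinearRegridder([0.0, 1.0, 2.0], [0.5, 1.5]))
albedo = regridded_data(reader)
```

Whether the reader should keep the regridded array at all is exactly the open question in this thread; dropping the `cache` field and returning the regridded array directly would match the alternative view.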
This sounds good to me!
I agree that storing spatially-varying parameters in the parameter struct seems cleaner. It would be nice to have all the parameters in one place.
I was thinking that it might be more clear to have a different function than `evaluate!` for the spatially-varying case, since we won't be evaluating/updating over time. Maybe something like `set!` would emphasize that we're setting values that won't change throughout the simulation.
I like `set!`! I will try that in the proof of concept.
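The distinction being discussed can be sketched as follows: `evaluate!` takes a time argument and is called repeatedly, while `set!` has no time argument and is called once before the simulation starts. This is a minimal illustration, not the actual implementation. `TimeVaryingScalar` and `SpaceVaryingField` are hypothetical stand-ins for `TimeVaryingInput` and a 2D `SpaceVaryingInput`, and plain matrices stand in for model fields.

```julia
# Hypothetical stand-in for a time-varying input: a function of time.
struct TimeVaryingScalar{F}
    f::F
end

# Time-varying case: the destination is updated for a given time t.
evaluate!(dest, input::TimeVaryingScalar, t) = fill!(dest, input.f(t))

# Hypothetical stand-in for a spatially varying, time-constant input.
struct SpaceVaryingField
    values::Matrix{Float64}
end

# Time-constant case: no time argument, since the values never change.
set!(dest, input::SpaceVaryingField) = copyto!(dest, input.values)

# Usage: the albedo is set once; the precipitation is updated in the time loop.
albedo = zeros(2, 2)
set!(albedo, SpaceVaryingField([0.1 0.2; 0.3 0.4]))

precip = zeros(2, 2)
precip_input = TimeVaryingScalar(t -> 2t)
for t in (0.0, 1.0, 2.0)
    evaluate!(precip, precip_input, t)
end
```

Using a separate name keeps the call sites self-documenting: anything passed to `set!` is fixed for the run, and anything passed to `evaluate!` is expected to change.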
I'm not sure the `FileReader` should hold the regridded data, because that is the thing we ultimately need, and it will live in the parameter struct as the parameter values. When I have something more concrete, we can discuss!
Implemented in ClimaUtilities: https://github.com/CliMA/ClimaUtilities.jl/pull/19
Is your feature request related to a problem? Please describe.
The purpose of this issue is to introduce code that allows us to read in datasets of spatially varying parameters that do not vary in time and to make use of them throughout the simulation. Time-dependent spatially varying parameters are being addressed by @Sbozzolo and link to PR/Issue:
We currently have regridding tools implemented for reading in 2D data from a file and storing it in the cache at the beginning of the simulation (the Bucket bare ground albedo). We also support site-level runs where the parameters are simply scalars.
Requirements
A unified interface for handling site-level and global runs [surface parameters as scalars or 2d fields, or as 1d fields (depth resolved at a site) or 3d fields (depth resolved globally)].
We would follow the same approach as the `TimeVaryingInput`: an abstract type `AbstractSpaceVaryingInput`, a constructor `SpaceVaryingInput` which has the same interface regardless of domain configuration, and concrete types of `SpaceVaryingInput0D` (scalar), `SpaceVaryingInput2D`, etc. We would implement these first and can follow on with the 1D, 3D, and analytic cases as needed. These types implicitly specify several things:

- Each concrete type would define a method of `evaluate!` which updates the values of the parameters (where these values are stored and where this update is called is discussed below). This would happen only once prior to the start of the simulation. NOTE: we may not need to call this `evaluate!`, i.e. we may not need to extend the function we have already for `TimeVaryingInput`. Let's discuss if it is cleaner to use a different name and function.
- A `DataHandler` when needed (1D, 2D, 3D). In this case, this would be a `FileReader` which reads in the data and handles regridding. Details of the regridding would be stored in the `FileReader` object and the `SpaceVaryingInput` would not need to know about it. In temporally varying cases, the `DataHandler` would also use the same `FileReader` structs and methods, but also contain an object that specifies how and when to read in the data during the simulation. The difference is that in the temporally constant case, we only need the `FileReader` because all the data will be read prior to the simulation start and because we do not need to update the values in time.

I think the `FileReader` will ultimately only contain the path to the raw data, and then it will read it in, regrid it, when asked to by `evaluate!`. Does this mean that in the temporal case that regridding happens on the fly? The alternate is what we have now: regridding up front and storing those in separate files. If we do this, the `FileReader` will contain the path to the raw and regridded data. I think this is what we have implemented currently.

Decision on where to store the spatially varying parameters.
The options are:

1. in the cache, where we set the parameters with `set_initial_cache`, which already exists, or
2. in the parameter struct for the model, where they get set with the constructor.

The upside to the former is that we already have a `set_initial_cache` function we can work with. We could just add `evaluate!` commands for all the parameters for that model. It might be nice to define a default which does this. The challenge is that we then need to add in parameters based on model type (and model parameterization type) in a flexible way, which isn't hard to do (we do this already for other aspects of the model, like prognostic variables), but requires more code changes. The cache would then have a mix of spatially and temporally varying quantities, and globally constant params (e.g. `g`) would be stored elsewhere.

Alternatively, we can store the parameters in the Parameter struct itself for the model, along with the `earth_param_set` (global constant params and fundamental constants). This is more akin to what we are doing now and may be conceptually cleaner (parameters are stored in the parameter struct, and not partly stored in the cache and partly stored somewhere else). It also means that what is in the cache are things that get updated in time, with the exception of the `dss_buffer` :P and that's kind of nice. And, since the models already specify their parameters in the Parameter struct, we don't need to define a new way of adding these fields to the cache.

I vote option 2.
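As a rough illustration of the parameter-struct option combined with the unified constructor from the requirements, the sketch below dispatches on the domain to pick a concrete input type and stores the resulting values in the parameter struct at construction. It assumes a lot: `PointDomain`, `SphericalSurface`, `parameter_values`, and `ToyBucketParameters` are hypothetical names, plain matrices stand in for model fields, and a `NamedTuple` stands in for `earth_param_set`. Only the names `AbstractSpaceVaryingInput`, `SpaceVaryingInput`, `SpaceVaryingInput0D`, and `SpaceVaryingInput2D` come from the issue text.

```julia
abstract type AbstractSpaceVaryingInput end

struct SpaceVaryingInput0D{FT} <: AbstractSpaceVaryingInput
    value::FT            # scalar parameter for site-level runs
end

struct SpaceVaryingInput2D{FT} <: AbstractSpaceVaryingInput
    values::Matrix{FT}   # surface field for global runs
end

# Hypothetical domain types, used only to select the concrete input type.
struct PointDomain end
struct SphericalSurface
    nlat::Int
    nlon::Int
end

# One constructor name with the same call shape for every domain.
SpaceVaryingInput(raw::Real, ::PointDomain) = SpaceVaryingInput0D(float(raw))
function SpaceVaryingInput(raw::AbstractMatrix, domain::SphericalSurface)
    size(raw) == (domain.nlat, domain.nlon) ||
        throw(DimensionMismatch("data does not match the domain grid"))
    return SpaceVaryingInput2D(Matrix{Float64}(raw))
end

# Hypothetical accessor returning the values to store as the parameter.
parameter_values(input::SpaceVaryingInput0D) = input.value
parameter_values(input::SpaceVaryingInput2D) = input.values

# Option 2: spatially varying parameters live next to the global constants.
struct ToyBucketParameters{A, E}
    albedo::A            # scalar (site) or matrix (global)
    earth_param_set::E   # global constants, e.g. gravity
end

# The model constructor resolves the input once; nothing is added to the cache.
function ToyBucketParameters(domain; albedo, earth_param_set)
    input = SpaceVaryingInput(albedo, domain)
    return ToyBucketParameters(parameter_values(input), earth_param_set)
end

# Usage: the same call works for a site-level and a global configuration.
earth = (; grav = 9.81)
site = ToyBucketParameters(PointDomain(); albedo = 0.2, earth_param_set = earth)
globe = ToyBucketParameters(SphericalSurface(2, 3);
                            albedo = fill(0.3, 2, 3), earth_param_set = earth)
```

With this layout, model code reads `params.albedo` in both configurations, and the cache is left for quantities that change in time.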
Proposal