Open hkershaw-brown opened 1 month ago
WRF PHB is read from a wrfinput template file, but is PHB in every wrf file?
If so it is "Per ensemble member static data" that is equal for every ensemble member
Here's a question that might influence our choices: is it reasonably easy to store some kinds of data distributed across a single node, which is essentially the tasks we request from each node? This would cut down on memory usage and not increase internode communication.
Here's a framework for thinking about names for the kinds of data filter needs to store, and some possibilities to consider. Short and common usually wins over longer and more meaningful (except when trying to sound impressive: "intercomparison", "irregardless", ...). I tried to think of short and meaningful descriptions. Combinations of 2 simple words can be useful.
First dimension: time varying;
no = metadata about grids, including surface and boundaries.
"static" in my/most vocabularies
yes = "evolving", "time varying" (pairs with member-varying, below)
due directly to assimilation:
"assimilated", "updated"
due indirectly to assimilation through the model forecast: (some is currently called "no-copy-back")
"not updated", "carried", "passive", "baggage"
Second dimension: within an ensemble:
not varying among members: (Helen has called this "static"; it could be made specific by "ensemble static")
"no spread", "ensemble constant", "member independent"
varying between members:
"updated"? (implies time varying too)
"member varying", "member dependent"
Third dimension: size. Mostly determines the importance of distributing it.
1D, 2D, 3D
I prefer leaving "prognostic" and "diagnostic" for classifying variables in models.
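To make the three dimensions above concrete, here is a minimal sketch of how they could be written down as a small record type. All class, field, and method names are hypothetical and not taken from DART; the labels are just candidates from the list above, and the "3D or larger is worth distributing" rule is an assumption for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class TimeBehavior(Enum):
    STATIC = "static"    # does not change in time
    UPDATED = "updated"  # changed directly by assimilation
    CARRIED = "carried"  # changed only indirectly, through the model forecast


class EnsembleBehavior(Enum):
    ENSEMBLE_CONSTANT = "ensemble constant"  # same for every member
    MEMBER_VARYING = "member varying"        # differs between members


@dataclass(frozen=True)
class DataKind:
    name: str
    time: TimeBehavior
    ensemble: EnsembleBehavior
    ndim: int  # 1, 2 or 3

    def worth_distributing(self) -> bool:
        # Assumption: size is what matters, and 3D fields are the large ones.
        return self.ndim >= 3


# WRF PHB as described in this thread: static in time and across members,
# treated here as a 3D field.
phb = DataKind("PHB", TimeBehavior.STATIC, EnsembleBehavior.ENSEMBLE_CONSTANT, 3)
print(phb.worth_distributing())
```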
is it reasonably easy to store some kinds of data distributed across a single node, which is essentially the tasks we request from each node? This would cut down on memory usage and not increase internode communication.
Yes for sure it is "easy" - it is just counting things.
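As a rough illustration of the "counting": a block decomposition of one array across the tasks on a node. This is only a sketch in plain Python, not DART code; `local_slice` and `owner_of` are hypothetical names, and a real implementation would also need each task's rank within its node (for example from an MPI shared-memory communicator).

```python
def local_slice(n_items, tasks_per_node, node_rank):
    """Return the [start, stop) range of items held by one task on a node."""
    base, extra = divmod(n_items, tasks_per_node)
    # The first `extra` tasks each hold one additional item.
    start = node_rank * base + min(node_rank, extra)
    stop = start + base + (1 if node_rank < extra else 0)
    return start, stop


def owner_of(index, n_items, tasks_per_node):
    """Return the node-local rank of the task that holds item `index`."""
    base, extra = divmod(n_items, tasks_per_node)
    boundary = extra * (base + 1)  # first index held by a "small" block
    if index < boundary:
        return index // (base + 1)
    return extra + (index - boundary) // base


# 10 items over 4 tasks on one node: block sizes 3, 3, 2, 2.
for rank in range(4):
    print(rank, local_slice(10, 4, rank))
```

Any task can then work out which on-node neighbour holds a given element without communicating off the node.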
WRF PHB is read from a wrfinput template file, but is PHB in every wrf file? If so it is "Per ensemble member static data" that is equal for every ensemble member
As far as I know PHB (base state geopotential) is in every wrfinput file. It is static both across ensemble members and in time. It needs to be summed with PH (perturbation geopotential) to provide the actual geopotential.
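A minimal sketch of that sum, with one shared PHB column and a per-member PH column. The numbers are made up for illustration, the function names are hypothetical, and plain lists stand in for the real 3D fields.

```python
G = 9.81  # gravitational acceleration in m s^-2


def full_geopotential(ph, phb):
    """Total geopotential (m^2 s^-2) = perturbation PH + base state PHB."""
    return [p + b for p, b in zip(ph, phb)]


def geopotential_height(ph, phb):
    """Geopotential height (m) = (PH + PHB) / g."""
    return [(p + b) / G for p, b in zip(ph, phb)]


# One PHB column shared by all members (made-up values).
phb = [0.0, 9810.0, 19620.0]
# PH differs per member (made-up values).
ph_by_member = {
    1: [0.0, 12.0, -30.0],
    2: [0.0, -8.0, 45.0],
}

for member, ph in ph_by_member.items():
    print(member, full_geopotential(ph, phb))
```

Because PHB is identical for every member, only PH needs to be stored per member; PHB could be held once.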
There are several model_mods and core DART modules that have a fixed-size memory requirement on each processor. Total memory usage is static_mem * num_procs, so the per-processor requirement does not shrink as you add processors, and it is a hard limit on the model size in DART.
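To put numbers on this, a back-of-the-envelope sketch comparing three layouts: a full copy on every task (the static_mem * num_procs case), one copy per node shared out among that node's tasks, and one copy spread over all tasks. The layout names and the example sizes are assumptions for illustration only.

```python
def total_static_memory(static_bytes, num_procs, tasks_per_node, layout):
    """Total bytes of static data held across the whole job."""
    if layout == "per_task":      # every task holds a full copy
        return static_bytes * num_procs
    if layout == "per_node":      # one copy per node, split among its tasks
        num_nodes = -(-num_procs // tasks_per_node)  # ceiling division
        return static_bytes * num_nodes
    if layout == "distributed":   # one copy, split among all tasks
        return static_bytes
    raise ValueError(f"unknown layout: {layout}")


GIB = 1024 ** 3
static_bytes = 2 * GIB  # made-up size of the static data
for layout in ("per_task", "per_node", "distributed"):
    total = total_static_memory(static_bytes, 256, 128, layout)
    print(layout, total // GIB, "GiB")
```

With these made-up numbers, the per-task layout holds 512 GiB in total, the per-node layout 4 GiB, and the fully distributed layout 2 GiB.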
Goal:
Rather than the current:
Note the code may need to be sensible about which static data is tiny (fine on every core) vs. large.
Static data in DART:
static data, same across the ensemble:
Per ensemble member static data: This gets put into the state at the moment, so it is inflated (maybe it should not be). An example (I think) is the CLM fields that are 'no-update', see #276.
In addition (going in as a separate issue), there are the observation sequence files, which are on every core (particularly for external forward operators, which are in the obs sequence).