ImperialCollegeLondon / pyrealm

Development of the pyrealm package, providing an integrated toolbox for modelling plant productivity, growth and demography using Python.
https://pyrealm.readthedocs.io/
MIT License
23 stars 9 forks source link

Allow FastSlowModel to take initial realised values #82

Closed davidorme closed 2 weeks ago

davidorme commented 1 year ago

Is your feature request related to a problem? Please describe.

When a FastSlowModel is fitted, the first 'realised' values are equal to the optimal value at time $t=0$ and thereafter the memory effect is used to impose a lag. If you break up a time series into chunks, then each chunk will 'reset' to the optimal value at the start of the chunk, so there is a brief period where a memory reset happens.

Describe the solution you'd like

The FastSlowPModel could accept a dictionary of realised $V{cmax}$, $J{max}$ and $\xi$ values, which are used instead of the optimal value at $t=0$. This would allow a time series to maintain a continuous memory effect, even if the models are fitted in chunks. The class would also provide a method to optionally extract that dictionary from a fitted FastSlowPModel.

You would still need to run the time chunks in series (not in parallel), to be able to feed those outputs into the next time chunk.

surbhigoel77 commented 8 months ago

Current Realized value of the slow parameters is calculated as:

R_{t} = R_{t-1}(1 - \alpha) + O_{t} \alpha where,

R{t-1} = Realized value at previous timestep O{t} = Optimal value alpha = control parameter

Optimal value is the ideal value at a time step whereas realised value is the true behaviour of the plant.

In current setup, for t=0, the realized value is equal to O_{t} as we do not have a provision to fetch the lagged value for first step of the timeseries chunk. To resolve this, we can build a dictionary of the slow response variables $V{cmax}$, $J{max}$ and $\xi$ so that lagged values can be fetched when required, without any breaks throughout the timeseries processing.

Refer to https://pyrealm.readthedocs.io/en/latest/users/pmodel/subdaily_details/subdaily_overview.html for more details

davidorme commented 8 months ago

We just need to be able to pass those initial values in as arguments somehow. They could be three arguments (init_vcmax25, init_jmax25 and init_xi - or something like that) or it could a be a dictionary with those three keys. Users should have to provide all three if they are going to use it, so it seems easier to have a single argument. The provided values need to match the dimensions across the first dimension of the provided PModelEnvironment (that is they need to provide a single time slice).

The obvious test would be to run a model in two time chunks and it should then be identical to running it one go.

surbhigoel77 commented 8 months ago

Self Note:

  1. An acclimation window is defined which is assumed to have the most optimal set of environmental conditions for a high plant productivity. There are 3 differet ways to get the acclimation window.
  2. These optimal env conditions/forcing variables (tc,patm, vpd, co2) are then used to calculate the fast-response variables for a subdaily timeseries.
  3. The slow variables are then calculated over the acclimation window using the memory effect R_{t} = R_{t-1}(1 - \alpha) + O_{t} \alpha. We start with R_{t} = O_{t}, where O_{t} is the optimal value calculated using the optimal environmental conditions. And then keep on passing the effect to the next future value.
  4. Then we get the slow varaibles value over the acclimation window and an average of these values is then interpolated back to the subdaily series.

Ques: From where are we getting the R_{t-1} component On which data, do we calculate the acclimation windiw, e.g. if we want to check the behaviour of the fast, slow variables at 10AM, how do we start the calculations?

davidorme commented 8 months ago

3. The slow variables are then calculated over the acclimation window using the memory effect

Not quite. The acclimation window is simply used to take a sample from the fast data and the averages within that window are used to calculate $O_t$ for each of the three acclimating variables. That gives a daily time series of $O_t$ for $t = 1 .. n$ and the memory effect is applied across the daily values to give $R_t$.

So, in the scenario we want to support, someone has run a previous time series and ended up with realised daily values. We want to be able to take the last set of daily realised values $R_n$ and feed those into a new run (the next year for example) as $R_1$.

surbhigoel77 commented 8 months ago

@davidorme do we want the [init_vmax,init_jamx,init_xi] as input from the user, or do we want the model to save the last_realised_values every time the model is run and pick these as initial values next time?

davidorme commented 8 months ago

It would be neat if a user could feed a previous SubdailyModel into that argument and have the last values get used automatically, but as a first step I think having init_realised: tuple[NDArray, NDArray, NDArray] = None is probably the way to go. We could then add an alternate input type as an additional PR.