Closed by Peter9192 1 year ago
Test data is available in the pyPhenology package. It can also be used as example data for this workflow.
Cool! Let's use that as a first implementation of the above prototype use case
Mixed-effects random forest https://manifoldai.github.io/merf/
Building further upon https://github.com/phenology/springtime/pull/28
For ease of development it would be helpful to have a Python script or notebook that implements a basic workflow that we would like to further develop. Let's start in this issue with listing the steps. This could be a starting point for discussion:
1. Load target data from remote sensing and in situ observations
   1.1 In situ from http://plantphenology.org/, see #2
2. Load predictor variables from weather datasets
   2.1 Remote sensing from #3
   2.2 Daymet: ...
   2.3 ...
3. Preprocess the data
   3.1 Derive additional predictor features: growing degree days, ...
   3.2 Extract features of interest from RS data: ...
4. Define train/test strategy
5. First apply a basic/benchmark/reference model
   5.1 Pick one from https://pyphenology.readthedocs.io/en/master/models.html
   5.2 Use the spring index model (re-use RS-Dat)?
6. Same workflow, but instead of fitting a physical/empirical model we use an ML method
   6.1 Simple sklearn model
   6.2 Mixed-effects random forest
   6.3 Explainable boosting machines
   6.4 Mixed-effects explainable boosting machines
7. Compare the performance of the ML model to the physical model
   7.1 What kind of scores?
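To make the discussion concrete, here is a minimal sketch of the preprocessing, train/test, benchmark, and scoring steps listed above, using only the standard library. Everything in it is an assumption for illustration: the data are synthetic, the function names (`growing_degree_days`, `predict_doy`, `fit_threshold`) are hypothetical, the base temperature of 5 °C is arbitrary, and the benchmark is a simple thermal-time style threshold on accumulated growing degree days rather than any pyPhenology model. RMSE and MAE are shown as two candidate scores, not as a decision.

```python
import math
import random


def growing_degree_days(daily_mean_temps, base_temp=5.0):
    """Return cumulative growing degree days for a series of daily mean temperatures."""
    total = 0.0
    cumulative = []
    for temp in daily_mean_temps:
        total += max(temp - base_temp, 0.0)  # only warmth above the base counts
        cumulative.append(total)
    return cumulative


def predict_doy(daily_mean_temps, threshold, base_temp=5.0):
    """First day of year (1-based) on which accumulated GDD reaches the threshold."""
    gdd_series = growing_degree_days(daily_mean_temps, base_temp)
    for doy, gdd in enumerate(gdd_series, start=1):
        if gdd >= threshold:
            return doy
    return len(daily_mean_temps)  # threshold never reached: fall back to the last day


def rmse(observed, predicted):
    """Root mean squared error, in days."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def mae(observed, predicted):
    """Mean absolute error, in days."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def fit_threshold(samples, candidates, base_temp=5.0):
    """Pick the candidate GDD threshold with the lowest RMSE on the training samples."""
    def score(threshold):
        predicted = [predict_doy(temps, threshold, base_temp) for temps, _ in samples]
        return rmse([doy for _, doy in samples], predicted)
    return min(candidates, key=score)


# Synthetic site-years (made up for illustration): a linear spring warm-up with a
# per-sample offset. The "observed" day of year comes from a true threshold of 100,
# with no observation noise.
rng = random.Random(42)
samples = []
for _ in range(20):
    offset = rng.uniform(-3.0, 3.0)
    temps = [offset + 0.15 * day for day in range(1, 181)]
    samples.append((temps, predict_doy(temps, threshold=100.0)))

# Train/test strategy: a plain random split (15 train, 5 test).
rng.shuffle(samples)
train, test = samples[:15], samples[15:]

# Benchmark model: fit the threshold on the training set, score on the test set.
threshold = fit_threshold(train, candidates=range(50, 151, 10))
observed = [doy for _, doy in test]
predicted = [predict_doy(temps, threshold) for temps, _ in test]
print(f"threshold={threshold}, RMSE={rmse(observed, predicted):.2f}, MAE={mae(observed, predicted):.2f}")
```

Because the synthetic observations carry no noise, the fit recovers the threshold of 100 exactly and both scores are zero; with real observations neither would be expected. An ML method from step 6 would replace `fit_threshold` and `predict_doy`, while the split and the scoring functions could stay the same, so both models are compared on identical test samples.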