Random forest modelling understanding

nhs-bnssg-analytics / d_and_c

Scoping the possibility of predicting performance from demand and capacity metrics

This comment is related to item 2 in the checklist above

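The image above shows the distribution of the target variable over time. It increases year on year. A random forest (RF) predicts by averaging target values seen in training, so it cannot predict a value above the highest one it was trained on. That could mean that RF is inappropriate for predicting future performance if the target keeps on increasing?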
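To illustrate the concern, here is a minimal sketch in plain Python. It is not the code used in this repo. The data are synthetic (a target that rises by 10 units per year plus noise), and `fit_tree`, `fit_forest` and the other names are hypothetical helpers written for this example. It fits a small bagged ensemble of regression trees on 2017 and 2018, then predicts a point in 2019.

```python
import random
from statistics import mean


def fit_tree(xs, ys, min_leaf=2, depth=0, max_depth=4):
    """Fit a tiny regression tree on one numeric feature.

    A leaf is stored as the mean target of its training rows;
    an internal node is a tuple (threshold, left_subtree, right_subtree).
    """
    if depth >= max_depth or len(xs) < 2 * min_leaf or len(set(xs)) == 1:
        return mean(ys)
    best = None
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t)
    if best is None:
        return mean(ys)
    t = best[1]
    lo = [(x, y) for x, y in zip(xs, ys) if x <= t]
    hi = [(x, y) for x, y in zip(xs, ys) if x > t]
    return (
        t,
        fit_tree([x for x, _ in lo], [y for _, y in lo], min_leaf, depth + 1, max_depth),
        fit_tree([x for x, _ in hi], [y for _, y in hi], min_leaf, depth + 1, max_depth),
    )


def predict_tree(tree, x):
    """Walk down the tree until a leaf (a plain number) is reached."""
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if x <= t else right
    return tree


def fit_forest(xs, ys, n_trees=50, seed=0):
    """Bagging: each tree is fitted to a bootstrap resample of the rows."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_tree([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """The forest prediction is the average of the tree predictions."""
    return mean(predict_tree(tree, x) for tree in forest)


# Synthetic monthly data for 2017 and 2018 (an assumption, not the real data).
rng = random.Random(1)
train_x = [2017 + m / 12 for m in range(24)]
train_y = [10.0 * (x - 2017) + rng.gauss(0, 0.5) for x in train_x]

forest = fit_forest(train_x, train_y)
pred_2019 = predict_forest(forest, 2019.5)
true_2019 = 10.0 * (2019.5 - 2017)  # value the trend would give in mid-2019

print(f"max training target: {max(train_y):.2f}")
print(f"forest prediction:   {pred_2019:.2f}")
print(f"trend value:         {true_2019:.2f}")
```

Every leaf holds a mean of training targets, so the forest prediction for 2019 cannot exceed the largest target seen in 2017 and 2018, however far the trend has moved on.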

When predicting 2019 data, this chart shows observed versus expected when using 2017 and 2018 input data:

Using the same data but splitting it randomly (i.e., ignoring the year of the data), the same chart is as follows:

In the second scenario, the RF model underpredicts at high values and overpredicts at low values.
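For reference, the two splitting strategies compared above can be sketched as follows. This is an illustration only, not the code used in this repo. The row layout (a `year` and an `org` field) and the function names `split_by_year` and `split_randomly` are assumptions made for the example.

```python
import random


def split_by_year(rows, test_year):
    """Train on every year before test_year, test on test_year only."""
    train = [r for r in rows if r["year"] < test_year]
    test = [r for r in rows if r["year"] == test_year]
    return train, test


def split_randomly(rows, test_fraction=1 / 3, seed=0):
    """Shuffle all rows and hold out a fraction, ignoring the year."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical rows: 10 organisations observed in each of three years.
rows = [{"year": y, "org": o} for y in (2017, 2018, 2019) for o in range(10)]

time_train, time_test = split_by_year(rows, 2019)
rand_train, rand_test = split_randomly(rows)

time_train_years = sorted({r["year"] for r in time_train})
rand_train_years = sorted({r["year"] for r in rand_train})

print("time-based split trains on:", time_train_years)
print("random split trains on:    ", rand_train_years)
```

With the random split, rows from the test year also appear in the training set, so the model has already seen target values from that year. The time-based split is the one that matches the real task of predicting a year that has not happened yet.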

Random forest modelling understanding #22