Model assessment/testing and presentation

AMGold99 / ricardian

Ricardian land value paper (Gold, Binder, Nolte). For soil processing pipeline, see AMGold99/ssurgo-soil repo


Model assessment/testing and presentation #3

Open AMGold99 opened 2 years ago

AMGold99 commented 2 years ago

The ranger random forest model does provide an out-of-bag (OOB) estimate of error (comparable to what you would get by testing your model on a held-out 'test' subset), but I've been struggling to put to rest the question of model assessment. Because our outcome is continuous, we cannot present a confusion matrix the way we could with a classification model.

Further, are we comfortable with having a variable importance plot as the rough equivalent of a coefficient/se table? (Impurity-based variable importance plots/tables communicate how much each variable reduces node impurity, which for regression is the variance of the response, totaled over the splits that use that variable.)
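To make the impurity idea concrete, here is a minimal sketch in plain Python (not ranger code) of the impurity reduction from a single regression split, measured as the drop in the sum of squared deviations. The toy data, the variable names `x_strong` and `x_noise`, and the threshold are all hypothetical, chosen only for illustration.

```python
def sse(values):
    """Sum of squared deviations from the mean (node variance times node size)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def impurity_reduction(x, y, threshold):
    """Drop in SSE obtained by splitting the node on x <= threshold."""
    left = [yi for xi, yi in zip(x, y) if xi <= threshold]
    right = [yi for xi, yi in zip(x, y) if xi > threshold]
    return sse(y) - sse(left) - sse(right)


# Hypothetical toy data: y tracks x_strong closely and is unrelated to x_noise.
y = [1.0, 1.2, 0.9, 3.0, 3.1, 2.9]
x_strong = [1, 2, 3, 10, 11, 12]
x_noise = [5, 1, 9, 2, 8, 4]

gain_strong = impurity_reduction(x_strong, y, threshold=5)
gain_noise = impurity_reduction(x_noise, y, threshold=5)
```

A split on `x_strong` removes nearly all of the node's variation, while a split on `x_noise` removes almost none. An impurity-based importance score accumulates gains like these over the splits in the forest, so it ranks predictive usefulness but carries no sign or standard error.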

[ ] How should we assess the RF model's performance?
[ ] Is a variable importance plot sufficient for communicating relative explanatory power of land value determinants?

binders1 commented 2 years ago

We could randomly hold back a small portion (say 5 to 10%) of the data to use for additional model validation. Nolte (2020) doesn't do this, though that may be only because he's interested in validating the model against previously observed, publicly funded conservation purchases, which are explicitly excluded from the private transaction sales data used to fit the model.

So, we use OOB MSE to compare among RF models and generate variable importance plots. Then, we use the preferred model to estimate the (logged) sales prices of the observations we held out of the RF analysis entirely, and calculate the R-squared.
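A minimal sketch of that hold-out step in plain Python, assuming a 10% hold-out and a fixed seed (both illustrative choices, not decisions made in this thread). The function names `holdout_split` and `r_squared` and the demo values are hypothetical; in practice the predictions would come from the preferred RF model fit on the training rows only.

```python
import random


def holdout_split(n_obs, holdout_frac=0.10, seed=42):
    """Return (train_idx, holdout_idx): a random, non-overlapping partition of row indices."""
    idx = list(range(n_obs))
    random.Random(seed).shuffle(idx)
    n_hold = max(1, round(n_obs * holdout_frac))
    return idx[n_hold:], idx[:n_hold]


def r_squared(y_true, y_pred):
    """Out-of-sample R-squared: 1 - SS_res / SS_tot, with SS_tot taken around the mean of y_true."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Partition 200 hypothetical observations; the hold-out rows never enter the RF fit.
train_idx, holdout_idx = holdout_split(200, holdout_frac=0.10)

# Hypothetical logged sale prices and model predictions for four held-out parcels.
y_holdout_demo = [11.2, 12.0, 10.5, 13.1]
y_pred_demo = [11.0, 12.3, 10.9, 12.8]
demo_r2 = r_squared(y_holdout_demo, y_pred_demo)
```

Computed this way, R-squared can be negative on held-out data if the model predicts worse than the hold-out mean, which is itself a useful diagnostic.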

binders1 commented 2 years ago

I think the variable importance plot from the RF analysis will be useful, and will be complemented by reporting F-stats and standardized beta coefficients from the OLS model.

AMGold99 commented 2 years ago

Sounds good. The OOB MSE is a neat measurement, as it (to the best of my understanding) assesses the performance of a forest by running observation x through only those trees whose bootstrap sample did not include observation x, so each prediction comes from trees that never saw that observation during fitting. In other words, I'm hopeful that, as you say, we can compare different models using the OOB MSE and R^2 without having to do the manual testing that more primitive random forest modeling seemed to require.
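For reference, a minimal sketch of the OOB mechanism described above, in plain Python rather than ranger. To keep it short, each "tree" is replaced by a 1-nearest-neighbour predictor fit on its bootstrap sample; that stand-in, the function names, and the toy data are all assumptions for illustration. Only the bookkeeping (which learners may predict which observation) mirrors how an OOB error is formed.

```python
import random


def fit_stub(sample):
    """Stand-in for a tree: predict the y of the nearest x in the bootstrap sample."""
    def predict(x):
        return min(sample, key=lambda pair: abs(pair[0] - x))[1]
    return predict


def oob_mse(x, y, n_learners=200, seed=0):
    """Mean squared error using, for each observation, only learners that did not train on it."""
    rng = random.Random(seed)
    n = len(x)
    learners = []
    for _ in range(n_learners):
        in_bag = [rng.randrange(n) for _ in range(n)]  # bootstrap: draw n rows with replacement
        model = fit_stub([(x[i], y[i]) for i in in_bag])
        learners.append((set(in_bag), model))

    sq_errors = []
    for i in range(n):
        # Keep only the learners whose bootstrap sample excluded observation i.
        preds = [model(x[i]) for bag, model in learners if i not in bag]
        if preds:  # skip an observation that happened to be in every bag
            oob_pred = sum(preds) / len(preds)
            sq_errors.append((y[i] - oob_pred) ** 2)
    return sum(sq_errors) / len(sq_errors)


# Hypothetical toy data: a noisy linear relationship.
data_rng = random.Random(1)
x = [i / 10 for i in range(50)]
y = [2 * xi + data_rng.gauss(0, 0.1) for xi in x]
mse = oob_mse(x, y)
```

Each bootstrap sample leaves out roughly a third of the observations, so every observation gets predictions from a sizeable subset of learners that never saw it. That is why the OOB error behaves like a test-set error without a separate split.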