AMGold99 opened 2 years ago
We could randomly hold back a small portion (say 5 to 10%) of the data to use for additional model validation. Nolte (2020) doesn't do this, though that may only be because he's interested in validating the model against previously observed, publicly funded conservation purchases, which are explicitly excluded from the private transaction sales data used to fit the model.
So, we use OOB MSE to compare among RF models and generate variable importance plots. Then, we use the preferred model to estimate the (logged) sales prices of the observations we held out of the RF analysis entirely, and calculate the R-squared.
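Something like this minimal sketch of the workflow, assuming a data frame `sales` with a logged price column `log_price` (both names are placeholders, not actual objects from our repo):

```r
library(ranger)

set.seed(42)

# Hold back ~10% of the data for final validation
n       <- nrow(sales)
holdout <- sample(n, size = floor(0.1 * n))
train   <- sales[-holdout, ]
test    <- sales[holdout, ]

# Fit the random forest on the remaining 90%
fit <- ranger(
  log_price ~ .,
  data       = train,
  num.trees  = 500,
  importance = "impurity"
)

# OOB MSE, for comparing candidate RF specifications
fit$prediction.error

# Out-of-sample R^2 on the held-out observations
pred <- predict(fit, data = test)$predictions
rss  <- sum((test$log_price - pred)^2)
tss  <- sum((test$log_price - mean(test$log_price))^2)
1 - rss / tss
```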
I think the variable importance plot from the RF analysis will be useful, and will be complemented by reporting F-stats and standardized beta coefficients from the OLS model.
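For the OLS side, a rough sketch of what I have in mind (reusing the placeholder `train` data from above; one common recipe for standardized betas is to refit on scaled variables):

```r
# Hypothetical OLS counterpart on the same training data
ols <- lm(log_price ~ ., data = train)

# Overall F-statistic (value, numerator df, denominator df)
summary(ols)$fstatistic

# Standardized (beta) coefficients: scale the numeric columns,
# including the outcome, so coefficients are in SD units
num_cols <- sapply(train, is.numeric)
train_scaled <- train
train_scaled[num_cols] <- scale(train[num_cols])
ols_std <- lm(log_price ~ ., data = train_scaled)
coef(summary(ols_std))
```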
Sounds good. The OOB MSE is a neat measurement, as it (to the best of my understanding) assesses the performance of a forest by running observation x through only the trees whose bootstrap samples excluded x, i.e., the trees that never saw x during training. In other words, I'm hopeful that, as you say, we can compare different models using the OOB MSE and R^2 without having to do the manual testing that more primitive random forest modeling seemed to require.
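If it helps to see that mechanism concretely: ranger stores each observation's OOB prediction in the fitted object, so the reported OOB error can be recomputed by hand. Continuing the sketch above (again, `fit` and `train` are placeholders):

```r
# Each entry is the average prediction over only those trees whose
# bootstrap sample excluded that observation
oob_pred <- fit$predictions

# Recomputing the OOB MSE by hand should match fit$prediction.error
# (na.rm guards against the rare observation that was in-bag in
# every tree and so has no OOB prediction)
mean((train$log_price - oob_pred)^2, na.rm = TRUE)
fit$prediction.error
```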
The ranger random forest model does provide an OOB estimate of error (the kind you would get by testing your model on a 'test' data subset), but I've been struggling to put to rest the question of model assessment. Unlike with classification models, we cannot present a confusion matrix.
Further, are we comfortable with having a variable importance plot as the rough equivalent of a coefficient/se table? (Variable importance plots/tables communicate how much each variable reduces node impurity, which is node variance in the regression setting, summed over all the splits where that variable is used across the forest.)
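If we do go that route, pulling the importance scores out of the fitted object is straightforward. A rough sketch, reusing the placeholder `fit` from above (which was trained with `importance = "impurity"`):

```r
# Impurity-based importance: total decrease in node variance
# attributable to splits on each variable, summed over all trees
imp <- sort(ranger::importance(fit), decreasing = TRUE)

# A quick base-graphics importance plot, largest at the top
barplot(rev(imp), horiz = TRUE, las = 1,
        xlab = "Total impurity (variance) reduction")
```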