Closed: bnuzyc91 closed this issue 3 years ago
Training error will almost always be smaller than out-of-sample error. For example, in least squares the training error measures goodness of fit, while the out-of-sample error is performance on a held-out test data set. Naturally the first will be smaller, because you are evaluating on the same data you trained on. This is machine learning 101.
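The point above can be illustrated with a quick least-squares sketch (Python/NumPy assumed here, since the thread doesn't name a language): fit on one split, evaluate on the other, and the training MSE is typically the smaller of the two.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.5, size=100)

# Simple 70/30 split into training and test sets
X_tr, y_tr, X_te, y_te = X[:70], y[:70], X[70:], y[70:]

# Ordinary least squares fit on the training split only
beta, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)

train_mse = np.mean((y_tr - X_tr @ beta) ** 2)
test_mse = np.mean((y_te - X_te @ beta) ** 2)
print(f"train MSE: {train_mse:.4f}")
print(f"test  MSE: {test_mse:.4f}")
```

On most random draws the training MSE comes out below the test MSE, though on any single split the gap can be small or even reversed by chance.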
I notice that in my application, I trained the model on the training dataset (DS) and got a training OOB error rate. When I apply the trained model to a testing DS, the testing error rate can sometimes be smaller than the training OOB error rate.
I got the impression from my other modeling experience that a model should perform better on the training DS than on the testing DS.
So could you help me understand why the testing error rate can sometimes be smaller than the training OOB error rate?
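One plausible explanation worth checking: each OOB prediction is made by only the subset of trees (roughly 37% of the forest) that did not see that sample in their bootstrap, so the OOB estimate behaves like a slightly weaker ensemble and can land above the error of the full forest on a test set. A minimal sketch (Python with scikit-learn assumed; the question doesn't state which library is in use) that reports both numbers side by side:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# oob_score=True makes the forest track out-of-bag accuracy during fitting
rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X_tr, y_tr)

oob_error = 1 - rf.oob_score_          # error from out-of-bag predictions
test_error = 1 - rf.score(X_te, y_te)  # error of the full forest on test data
print(f"OOB error:  {oob_error:.3f}")
print(f"test error: {test_error:.3f}")
```

Running this over several random seeds should show the two numbers tracking each other closely, with the test error sometimes coming out below the OOB error, especially on small datasets where both estimates are noisy.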