topepo / caret

caret (Classification And Regression Training): an R package containing miscellaneous functions for training and plotting classification and regression models
http://topepo.github.io/caret/index.html

Heldout predictions are different #1336

Open suzannejin opened 1 year ago

suzannejin commented 1 year ago

Hello, I am training a ridge regression model with self-defined resamples. The same held-out data are used twice, so each model outputs two predictions for each held-out observation. I would expect the predictions for the same held-out data to be identical, yet they differ slightly.
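A minimal sketch of the kind of setup described above, in case it helps to reproduce the behaviour. The simulated data, the `index`/`indexOut` lists, and the `lambda` value are hypothetical and not taken from the actual analysis. In particular, the sketch assumes that the two resamples share the same held-out rows but are trained on different rows. It also assumes the elasticnet package is installed, since caret's `"ridge"` method relies on it.

```r
library(caret)

set.seed(1)
n <- 30
# Hypothetical simulated predictors and outcome
x <- matrix(rnorm(n * 3), ncol = 3,
            dimnames = list(NULL, c("x1", "x2", "x3")))
y <- x[, 1] - 2 * x[, 2] + rnorm(n, sd = 0.1)

# Assumed custom resamples: rows 1-5 are held out in both,
# while the training rows differ between the two resamples.
index    <- list(Resample1 = 6:20, Resample2 = 11:30)
indexOut <- list(Resample1 = 1:5,  Resample2 = 1:5)

ctrl <- trainControl(method = "cv",
                     index = index,
                     indexOut = indexOut,
                     savePredictions = "final")

# Single hypothetical lambda so that only one model is evaluated
fit <- train(x = x, y = y,
             method = "ridge",
             tuneGrid = data.frame(lambda = 0.01),
             trControl = ctrl)

# Held-out predictions, ordered so repeated rowIndex values sit together
held_out <- fit$pred[order(fit$pred$rowIndex),
                     c("rowIndex", "Resample", "pred", "obs")]
print(held_out)
```

Comparing the two `pred` values for each `rowIndex` in `held_out` shows whether the predictions agree across resamples in this simplified setting.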

Below is a screenshot in which rows with the same rowIndex refer to the same held-out observation, shown with the corresponding predictions.

[Screenshot: held-out predictions listed by rowIndex]

In the manual, I found the following:

"For particular model, a grid of parameters (if any) is created and the model is trained on slightly different data for each candidate combination of tuning parameters. "

I suspect that the slightly different predictions may be related to this. My question is: do you modify the data before training or prediction? If so, what is the purpose?