Open fabian-s opened 8 years ago
The same problem occurs for lf.vd() terms. I want to use a model with a variable-domain covariate for a binary response. To assess prediction accuracy, out-of-bag prediction is essential.
library(refund)
data(sofa)
fit.vd1 <- pfr(death ~ lf.vd(SOFA) + age + los, family="binomial", data=sofa)
pred <- predict(fit.vd1, newdata = sofa)
# Error in eval(expr, envir, enclos) : object 'SOFA.arg' not found
A workaround is to use weights:
## fit the model using weights
train_ind <- sample(0:1, size = nrow(sofa), replace=TRUE)
fit_train <- pfr(death ~ lf.vd(SOFA) + age + los, family="binomial", data=sofa,
weights = train_ind)
## only keep the predictions with weight 0
pred_oob <- predict(fit_train, type = "response")[train_ind == 0]
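Once the out-of-bag probabilities are available, they can be scored against the held-out outcomes. A minimal sketch in base R follows; the helper name oob_scores and the 0.5 cutoff are my own assumptions for illustration, not part of refund.

```r
## Hypothetical helper (not part of refund): score predicted probabilities
## against observed binary outcomes (0/1 or logical).
oob_scores <- function(pred, obs, cutoff = 0.5) {
  obs <- as.numeric(obs)
  stopifnot(length(pred) == length(obs))
  c(
    brier    = mean((pred - obs)^2),           # mean squared error of probabilities
    misclass = mean((pred > cutoff) != (obs == 1))  # share of wrong class labels
  )
}

## toy check with made-up numbers
toy_scores <- oob_scores(pred = c(0.9, 0.2, 0.6, 0.1), obs = c(1, 0, 0, 0))

## with the objects from this issue, the call would presumably be:
## oob_scores(pred_oob, sofa$death[train_ind == 0])
```

The Brier score uses the full predicted probability, so it does not depend on the choice of cutoff, unlike the misclassification rate.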
But this is rather tedious, and I am not sure how the observations with weight 0 enter the model anyway. Consider the following fit, which uses only the training data instead of weights. The models fit_train and fit_train_data should then be equivalent.
## compare the model fit with weights to the model fit on the training data only
train_data <- sofa[train_ind == 1, ]
fit_train_data <- pfr(death ~ lf.vd(SOFA) + age + los, family="binomial", data=train_data)
## the two models should be equivalent, but e.g. the means differ
fit_train$pfr$datameans
fit_train_data$pfr$datameans
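For comparison, here is how the same weights-versus-subset check behaves for a plain glm() on simulated data. This is only a sketch of the expectation stated above: it says nothing about pfr internals, and the variable names and simulated data are made up. In glm(), zero-weight rows are excluded from the fitting step but still receive fitted values, which is what the weights workaround relies on. The differing datameans may mean that pfr computes this summary over all rows in data regardless of weights, but that is an assumption I have not verified.

```r
## simulated binary-response data (illustrative only)
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(0.5 + x))
d <- data.frame(x = x, y = y)
train_ind <- sample(0:1, size = n, replace = TRUE)

## fit with 0/1 weights vs. fit on the training rows only
fit_w <- glm(y ~ x, family = binomial, data = d, weights = train_ind)
fit_s <- glm(y ~ x, family = binomial, data = d[train_ind == 1, ])

## compare the coefficient estimates of the two fits
same_coef <- all.equal(coef(fit_w), coef(fit_s), tolerance = 1e-6)

## the weighted fit still carries one fitted value per row of d
n_fitted <- length(fitted(fit_w))
```

If pfr passes the weights straight through to the underlying fitting routine, the estimated coefficients of fit_train and fit_train_data could be compared in the same way.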
@jgellar: sorry to keep filing bugs against your code, but not being able to generate predictions really sucks... something like this may help