ailinweili / FDboost

Boosting Functional Regression Models. The current release version can be found on CRAN (http://cran.r-project.org/package=FDboost).

check cross validation results of fpco-FDboost for DTI2 dataset #12

Open ailinweili opened 7 years ago

ailinweili commented 7 years ago
  1. minkowski-fpco-FDboost, cross-validation over p = c(1, 2, 5, 10), pve = c(0.7, 0.95), add = c(TRUE, FALSE), fastcmd = c(TRUE, FALSE)
     [plots: dti2_mink_msetrain, dit2_mink_msevalid]

Comment:

  2. elasticMetric-fpco-FDboost, cross-validation over pve = c(0.7, 0.95), add = c(TRUE, FALSE), fastcmd = c(TRUE, FALSE)
     [plot: dit2_elastic_mse_train_valid]

Comment:

  3. dtw-fpco-FDboost, cross-validation over window.type = c("none", "itakura", "sakoechiba"), window.size = c(5, 10, 20)
     [plots: dit2_dtw_msetrain, dit2_dtw_msevalid]

Comment:

  4. correlation-fpco-FDboost, cross-validation over pve = c(0.7, 0.95), add = c(TRUE, FALSE), fastcmd = c(TRUE, FALSE)
     [plot: dit2_cor_msetrainvalid]

Comment:
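
For context, a minimal sketch of how such a grid search can be organized in R. It is not the exact script used here: bfpco and its arguments p, pve, add, and fastcmd are taken from this thread, distType = "minkowski" is an assumed spelling, and traindata/validdata refer to the train/validation split constructed later in the thread.

library(FDboost)
# hypothetical grid search over the Minkowski hyperparameters listed above
grid <- expand.grid(p = c(1, 2, 5, 10), pve = c(0.7, 0.95),
                    add = c(TRUE, FALSE), fastcmd = c(TRUE, FALSE))
grid$mse_valid <- NA_real_
for (i in seq_len(nrow(grid))) {
  fit <- FDboost(y ~ bfpco(x, s, distType = "minkowski", p = grid$p[i],
                           pve = grid$pve[i], add = grid$add[i],
                           fastcmd = grid$fastcmd[i]),
                 data = traindata, timeformula = NULL,
                 control = boost_control(mstop = 100))  # mstop itself would be tuned
  pred <- predict(fit, newdata = list(x = validdata$x, s = validdata$s))
  grid$mse_valid[i] <- mean((validdata$y - pred)^2)     # validation MSE per config
}
grid[order(grid$mse_valid), ]  # best configurations first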

fabian-s commented 7 years ago

Only validation set errors are relevant for the evaluation. Weird that almost none of the hyperparameters seem to make a big difference...

What are the validation set errors for a simple intercept model? You always need a baseline reference to evaluate the overall quality of the fits.
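
For reference, such a baseline just predicts the training-set mean for every validation observation. A minimal sketch, assuming the traindata/validdata split constructed later in the thread:

# intercept-only baseline: predict the mean of the training responses
pred.intercept <- rep(mean(traindata$y), length(validdata$y))
mse.intercept <- mean((validdata$y - pred.intercept)^2)
mse.intercept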

ailinweili commented 7 years ago

Hi, Fabian!

  1. The validation MSE of the intercept model, per CV fold, is:

     unlist(mse.intercept)
     # [1] 149.6382 145.4526 154.8933 165.4544 199.8487 145.7485 163.3762 179.4253 160.9315 123.6187

  2. I will improve the plots; thanks for the suggestion.

  3. DTW takes more time, so I did not cross-validate over add and fastcmd. Based on the results of the other models, fastcmd may not be that important, but add is.

  4. Regarding "weird that almost none of the hyperparameters seem to make a big difference": it confuses me too. I will send you a simple comparison soon.

Best, Weili

2017-08-23 15:13 GMT+02:00 Fabian Scheipl notifications@github.com:

  • You need better plots (i.e., they should at least all have the same y-axis for easy comparison; faceting / coloring according to fastcmd/add would be nice as well).
  • pve = 0.7 is too low -- use pve = .95, .99.
  • Minkowski: you can leave out p = 10 in the future.
  • DTW: for window.type = "none" it doesn't make any sense to vary window.size.
  • Why no add, fastcmd options for DTW?
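
A minimal sketch of the suggested plot layout with ggplot2, where cv_res is a hypothetical data frame with one row per fitted configuration and columns pve, add, fastcmd, and mse_valid:

library(ggplot2)
# facet_grid gives all panels a shared y-axis by default;
# panels split by fastcmd, colors distinguish add
ggplot(cv_res, aes(x = factor(pve), y = mse_valid, colour = add)) +
  geom_point() +
  facet_grid(. ~ fastcmd, labeller = label_both) +
  labs(x = "pve", y = "validation MSE")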


ailinweili commented 7 years ago

I used a single CV fold to check the model performance of pfr, fpc, fpco.dtw, and the intercept model; the results are shown at the end. The pfr model is copied from the af() example for the DTI dataset. Can you see any clues in the plot?

set.seed(1000)
train_index <- sample(1:length(mydata$y), size = round(length(mydata$y)*0.7), replace = FALSE)
valid_index <- setdiff(1:length(mydata$y), train_index)
all.equal(length(mydata$y), sum(length(train_index), length(valid_index)))
traindata <- list(x = mydata$x[train_index,], y = mydata$y[train_index], s = 1:93)
validdata <- list(x = mydata$x[valid_index,], y = mydata$y[valid_index], s = 1:93)
traindata.df <- data.frame(y = traindata$y, x = I(traindata$x))
validdata.df <- data.frame(y = validdata$y, x = I(validdata$x))

fit.pfr.af <- pfr(y ~ af(x, Qtransform=TRUE, k=c(7,7)), data = traindata.df)
fit.fpc <- FDboost(y ~ bfpc(x, s, pve = 0.99), data = traindata, timeformula = NULL, control = boost_control(mstop = 19))
fit.fpco.dtw <- FDboost(y ~ bfpco(x, s, distType = "dtw", penalty = "identity", pve = 0.99, 
                                  window.type = "sakoechiba", window.size = 5), 
                        data = traindata, timeformula = NULL, control = boost_control(mstop = 30))

pred.pfr.af <- predict(fit.pfr.af, newdata = validdata.df) # how to predict using pfr?
pred.pfc <- predict(fit.fpc, newdata = list(x = validdata$x, s = validdata$s))
pred.fpco.dtw <- predict(fit.fpco.dtw, newdata = list(x = validdata$x, s = validdata$s))

mse.intercept <- mean((validdata$y - mean(traindata$y))^2)
mse.pfr.af <- mean((validdata$y - pred.pfr.af)^2)
mse.pfc <- mean((validdata$y - pred.pfc)^2)
mse.fpco.dtw <- mean((validdata$y - pred.fpco.dtw)^2)

print(c(mse.intercept = mse.intercept, mse.pfr.af = mse.pfr.af, mse.pfc = mse.pfc, mse.fpco.dtw = mse.fpco.dtw))
# mse.intercept    mse.pfr.af     mse.pfc       mse.fpco.dtw 
# 149.6382          133.7873      137.7277      136.6862 

ord <- order(validdata$y)
matplot(data.frame(y.valid = validdata$y[ord], pred.pfr.af = pred.pfr.af[ord], 
                   pred.pfc = pred.pfc[ord], pred.fpco.dtw = pred.fpco.dtw[ord]), 
        type = "l", col = 1:4, pch = 16:19, main = "predicted value", xlab = "Nr", ylab = "")
legend(x = 70, y = 10, legend = c("y.valid", "pred.pfr.af", "pred.pfc", "pred.fpco.dtw"), col = 1:4, lty = c(1,1,1,1), cex = 0.7)

[plot: rplot, predicted vs. observed values on the validation set]
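
On the question in the code about predicting with pfr: refund ships a predict method for pfr fits, so the call above should work as written; a minimal variant with an explicit type argument (type = "response" is an assumption carried over from predict.gam):

# pfr objects are gam-like, so predict() accepts newdata plus a type argument
pred.pfr.af <- predict(fit.pfr.af, newdata = validdata.df, type = "response")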

ailinweili commented 7 years ago

In addition, I found the best mstop somewhat strange: the best number of iterations rarely exceeds 100. I have re-run some single models and checked the results; the best mstop values are still below 100. Is that normal? What does this imply?

# $wrap.FDboost.fpco.minkowski
#     distType  p dnr  pve   add fastcmd mstop
# 1  Minkowski  1   1 0.95  TRUE    TRUE    42
# 2  Minkowski  2   2 0.95  TRUE    TRUE    20
# 3  Minkowski  5   3 0.95  TRUE    TRUE    49
# 4  Minkowski 10   4 0.95  TRUE    TRUE    37
# 5  Minkowski  1   1 0.70  TRUE    TRUE    29
# 6  Minkowski  2   2 0.70  TRUE    TRUE    15
# 7  Minkowski  5   3 0.70  TRUE    TRUE    41
# 8  Minkowski 10   4 0.70  TRUE    TRUE    40
# 9  Minkowski  1   1 0.95 FALSE    TRUE    20
# 10 Minkowski  2   2 0.95 FALSE    TRUE    16
# 11 Minkowski  5   3 0.95 FALSE    TRUE    26
# 12 Minkowski 10   4 0.95 FALSE    TRUE    33
# 13 Minkowski  1   1 0.70 FALSE    TRUE    23
# 14 Minkowski  2   2 0.70 FALSE    TRUE   998
# 15 Minkowski  5   3 0.70 FALSE    TRUE    16
# 16 Minkowski 10   4 0.70 FALSE    TRUE    16
# 17 Minkowski  1   1 0.95  TRUE   FALSE    48
# 18 Minkowski  2   2 0.95  TRUE   FALSE    27
# 19 Minkowski  5   3 0.95  TRUE   FALSE    45
# 20 Minkowski 10   4 0.95  TRUE   FALSE    42
# 21 Minkowski  1   1 0.70  TRUE   FALSE    38
# 22 Minkowski  2   2 0.70  TRUE   FALSE    15
# 23 Minkowski  5   3 0.70  TRUE   FALSE    46
# 24 Minkowski 10   4 0.70  TRUE   FALSE    42
# 25 Minkowski  1   1 0.95 FALSE   FALSE    28
# 26 Minkowski  2   2 0.95 FALSE   FALSE    20
# 27 Minkowski  5   3 0.95 FALSE   FALSE    28
# 28 Minkowski 10   4 0.95 FALSE   FALSE    32
# 29 Minkowski  1   1 0.70 FALSE   FALSE    18
# 30 Minkowski  2   2 0.70 FALSE   FALSE    17
# 31 Minkowski  5   3 0.70 FALSE   FALSE    15
# 32 Minkowski 10   4 0.70 FALSE   FALSE    18

# $wrap.FDboost.fpco.elasticMetric
#        distType dnr   add  pve fastcmd mstop
# 1 elasticMetric   1  TRUE 0.95    TRUE    15
# 2 elasticMetric   1 FALSE 0.95    TRUE    22
# 3 elasticMetric   1  TRUE 0.70    TRUE    14
# 4 elasticMetric   1 FALSE 0.70    TRUE     8
# 5 elasticMetric   1  TRUE 0.95   FALSE     9
# 6 elasticMetric   1 FALSE 0.95   FALSE    11
# 7 elasticMetric   1  TRUE 0.70   FALSE    19
# 8 elasticMetric   1 FALSE 0.70   FALSE    18

# $wrap.FDboost.fpco.correlation
#      distType dnr  pve   add fastcmd mstop
# 1 correlation   1 0.95  TRUE    TRUE    33
# 2 correlation   1 0.70  TRUE    TRUE    22
# 3 correlation   1 0.95 FALSE    TRUE    30
# 4 correlation   1 0.70 FALSE    TRUE    23
# 5 correlation   1 0.95  TRUE   FALSE    26
# 6 correlation   1 0.70  TRUE   FALSE    27
# 7 correlation   1 0.95 FALSE   FALSE    30
# 8 correlation   1 0.70 FALSE   FALSE    22

# $wrap.FDboost.fpco.dtw
#   distType window.type window.size dnr mstop
# 1      dtw  sakoechiba           5   1    40
# 2      dtw     itakura           5   2    19
# 3      dtw        none           5   3    16
# 4      dtw  sakoechiba          10   4    24
# 5      dtw     itakura          10   5    29
# 6      dtw        none          10   6    16
# 7      dtw  sakoechiba          20   7    18
# 8      dtw     itakura          20   8    24
# 9      dtw        none          20   9    19
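
To double-check the small stopping iterations, one could re-run the resampling for a single configuration over a longer grid and inspect the out-of-bag risk curve. A minimal sketch, assuming the fit.fpc model from the comment above and FDboost's cvLong fold constructor:

set.seed(123)
# out-of-bag empirical risk over a longer mstop grid
cvr <- cvrisk(fit.fpc,
              folds = cvLong(id = fit.fpc$id, weights = model.weights(fit.fpc),
                             type = "kfold", B = 10),
              grid = 1:500)
mstop(cvr)  # optimal stopping iteration
plot(cvr)   # the risk curve should flatten or rise after the optimum
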
fabian-s commented 7 years ago

DTI is a pretty noisy data set; the brain scan result on its own just doesn't explain much variance of the PASAT score. That's to be expected. That's also the reason why mstop is so low -- the models start to overfit quickly, so few iterations are sufficient to achieve the best predictive performance.

Let's see how the other datasets perform; I think your code is sound.