Open fabian-s opened 7 years ago
(how) does it work? can you post some code below?
Hi, I did not realize I had closed this issue. I think it was by accident, sorry.
Hi, I wrote an example of how to use the bfpco base learner for a multinomially distributed response in FDboost. I could not find much material on how to use the "%O%" term correctly with the multinomial family, so I tried bols() on a dummy response variable, and it worked. The prediction accuracy is not bad.
# examples of multinomial regression using the fuelSubset dataset
library(FDboost)
data("fuelSubset")
## modelling data
myfuel <- fuelSubset
myfuel$heatanclass <- cut(fuelSubset$heatan, breaks = c(5,15,25,35), labels = c("a","b","c"))
### define a dummy vector with one factor level less than the outcome,
### which is used as reference category.
myfuel$heatandummy <- factor(levels(myfuel$heatanclass)[-nlevels(myfuel$heatanclass)])
## fit a multinomial FDboost model
mlm1 <- FDboost(heatanclass ~
                  bfpco(UVVIS, s = uvvis.lambda, df = 4) %O%
                  bols(heatandummy, df = 4, contrasts.arg = "contr.dummy") +
                  bfpco(NIR, s = nir.lambda, df = 4) %O%
                  bols(heatandummy, df = 4, contrasts.arg = "contr.dummy"),
                timeformula = ~bols(1), data = myfuel, family = Multinomial(),
                control = boost_control(mstop = 200))
## model performance
### contingency table
tab1 <- table(data = myfuel$heatanclass, fitted = predict(mlm1, type = "class"))
print(tab1)
### compute prediction accuracy
print(sum(diag(tab1))/sum(tab1))
## prediction on newdata
### prepare new data
set.seed(201)
index <- sample(1:length(myfuel$heatan), size = 50)
newdata <- list()
newdata$NIR <- myfuel$NIR[index, ]
newdata$UVVIS <- myfuel$UVVIS[index, ]
newdata$nir.lambda <- myfuel$nir.lambda
newdata$uvvis.lambda <- myfuel$uvvis.lambda
newdata$heatandummy <- myfuel$heatandummy
### prediction effect
tab2 <- table(myfuel$heatanclass[index], predict(mlm1, newdata = newdata, type = "class"))
print(tab2)
print(sum(diag(tab2))/sum(tab2))
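The accuracy computed from `diag()` above relies on the table being square, with rows and columns in the same class order. A minimal base-R sketch of a more defensive version follows; the labels are hypothetical toy values, not output from the model, and `truth`, `pred`, `acc_table` and `acc_direct` are names introduced only for this illustration.

```r
# Hypothetical labels standing in for myfuel$heatanclass[index] and the
# class predictions; they are not output from the model above.
truth <- factor(c("a", "b", "b", "c", "b", "a"), levels = c("a", "b", "c"))
pred  <- c("a", "b", "b", "b", "b", "b")   # class "c" is never predicted

# Forcing both margins onto the same levels keeps the table square,
# so diag() lines up with the correctly classified cells.
tab <- table(observed  = truth,
             predicted = factor(pred, levels = levels(truth)))
acc_table <- sum(diag(tab)) / sum(tab)

# Equivalent shortcut that never builds the table.
acc_direct <- mean(pred == as.character(truth))

print(tab)
print(c(acc_table, acc_direct))
```

Both routes give the same number; the direct comparison is simply harder to get wrong when a class is missing from the predictions.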
Hi,
After updating to the latest version from Github, I get:
mlm1 <- FDboost(heatanclass ~
                  bfpco(UVVIS, s = uvvis.lambda, df = 4) %O% bols(heatandummy, df = 4, contrasts.arg = "contr.dummy") +
                  bfpco(NIR, s = nir.lambda, df = 4) %O% bols(heatandummy, df = 4, contrasts.arg = "contr.dummy"),
                timeformula = ~bols(1), data = myfuel, family = Multinomial(),
                control = boost_control(mstop = 200))
# Error in dist(Y.tilde, method = distType, ...) : invalid distance method
traceback()
#13: stop("invalid distance method")
#12: dist(Y.tilde, method = distType, ...) at baselearners.R#2168
#11: (function (Y = NULL, Y.pred = NULL, center = FALSE, random.int = FALSE,
# nbasis = 10, argvals = NULL, distType = NULL, npc = NULL,
# npc.max = NULL, pve = 0.99, ...)
# {
# if (is.null(Y.pred))
# ...
#10: do.call(fpco.sc, decomppars) at baselearners.R#1884
#9: X_fpco(mf, vary, args = hyper_fpco(mf, vary, df = df, lambda = lambda,
# pve = pve, npc = npc, npc.max = npc.max, s = s, distType = distType,
# ...)) at baselearners.R#2419
#8: bfpco(UVVIS, s = uvvis.lambda, df = 4)
#7: bfpco(UVVIS, s = uvvis.lambda, df = 4) %O% bols(heatandummy,
# df = 4, contrasts.arg = "contr.dummy")
#6: inherits(a, "blg")
#5: bfpco(UVVIS, s = uvvis.lambda, df = 4) %O% bols(heatandummy,
# df = 4, contrasts.arg = "contr.dummy") + bfpco(NIR, s = nir.lambda,
# df = 4) %O% bols(heatandummy, df = 4, contrasts.arg = "contr.dummy")
#4: eval(expr, envir, enclos)
#3: eval(as.expression(formula[[3]]), envir = c(as.list(data), list(`+` = get("+"))),
# enclos = environment(formula))
#2: mboost(fm, data = data, weights = w, offset = offset, ...) at FDboost.R#1115
#1: FDboost(heatanclass ~ bfpco(UVVIS, s = uvvis.lambda, df = 4) %O%
# bols(heatandummy, df = 4, contrasts.arg = "contr.dummy") +
# bfpco(NIR, s = nir.lambda, df = 4) %O% bols(heatandummy,
# df = 4, contrasts.arg = "contr.dummy"), timeformula = ~bols(1),
# data = myfuel, family = Multinomial(), control = boost_control(mstop = 200))
If I do
mlm1 <- FDboost(heatanclass ~
                  bfpco(UVVIS, s = uvvis.lambda, df = 4, distType = "DTW") %O% bols(heatandummy, df = 4, contrasts.arg = "contr.dummy") +
                  bfpco(NIR, s = nir.lambda, df = 4, distType = "DTW") %O% bols(heatandummy, df = 4, contrasts.arg = "contr.dummy"),
                timeformula = ~bols(1), data = myfuel, family = Multinomial(),
                control = boost_control(mstop = 200))
instead, it works. It seems like bfpco doesn't hand over the default argument for distType correctly if it's not given explicitly?
Hi, Fabian. This is mainly because library(dtw) is not called at the very beginning.
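One way to see where the message comes from, as a sketch rather than a diagnosis of the package code: `stats::dist()` only accepts its own built-in method names and raises exactly this error for anything else, which would fit a `dist()` call resolving to the stats version instead of `proxy::dist()`. The matrix `Y` below is hypothetical toy data, not fuelSubset.

```r
# Toy curves standing in for the rows of a functional covariate
# (hypothetical data, not fuelSubset).
Y <- rbind(c(1, 2, 3, 4), c(2, 3, 4, 5), c(0, 1, 0, 1))

# stats::dist() only knows its built-in methods, so "DTW" is rejected.
msg <- tryCatch(
  {
    stats::dist(Y, method = "DTW")
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# A built-in method works as usual.
d <- stats::dist(Y, method = "euclidean")
print(d)
```

With proxy and dtw attached, `dist()` would be expected to resolve to `proxy::dist()`, for which dtw provides a "DTW" method. That part is not run here because it needs both packages installed.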
I see... Still, that's a bug: make sure that you import all the functions from other packages that you use in your code. The same goes for slanczos from mgcv (and probably more), e.g. devtools::check() gives me:
* checking R code for possible problems ... NOTE
X_fpco: no visible global function definition for ‘dist’
cmdscale_lanczos_new: no visible global function definition for
‘slanczos’
fpco.sc: no visible global function definition for ‘gamm4’
fpco.sc: no visible global function definition for ‘dist’
Undefined global functions or variables:
dist gamm4 slanczos
Consider adding
importFrom("stats", "dist")
... but I think you'd probably want importFrom("proxy", "dist"). Also, please do so by using roxygen2; do not edit NAMESPACE manually.
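A minimal sketch of what the roxygen2 route could look like for the three functions flagged in the check output. The file location is an assumption, and the source packages for `slanczos` (mgcv) and `dist` (proxy) follow the comments above; `gamm4` is assumed to come from the gamm4 package.

```r
# Hypothetical placement: a package-level documentation file such as
# R/FDboost-package.R. The tags below are read by roxygen2, not by R itself.

#' @importFrom proxy dist
#' @importFrom mgcv slanczos
#' @importFrom gamm4 gamm4
NULL

# After editing, regenerate NAMESPACE instead of touching it by hand:
# devtools::document()
```

The packages named in the tags would also need to be listed under Imports in DESCRIPTION, otherwise the check is likely to complain about undeclared dependencies.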
Yeah, I have updated the documentation
... so we can use the Phoneme dataset as a benchmark