Closed: antonioggsousa closed this issue 2 years ago
I guess there are two strategies for multi-class prediction: one-vs-all (one binary model per class) and one-vs-one (one binary model per pair of classes).
I don't know which one is used by default in {LiblineaR}, but that shouldn't be too hard to implement with a for loop I guess.
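For instance, a one-vs-all loop might look like the following sketch, using `glm()` as a stand-in binary classifier on `iris` (the helper names `ova_fit`/`ova_predict` are illustrative, not from either package):

```r
# One-vs-all sketch: fit one binary logistic regression per class,
# then predict the class whose model gives the highest probability.
ova_fit <- function(data, labels) {
  lapply(levels(labels), function(cl) {
    y01 <- as.integer(labels == cl)               # current class vs. the rest
    glm(y01 ~ ., data = data, family = binomial)
  })
}

ova_predict <- function(models, classes, newdata) {
  probs <- sapply(models, predict, newdata = newdata, type = "response")
  classes[max.col(probs, ties.method = "first")]  # highest-probability class
}

models <- ova_fit(iris[, -5], iris$Species)
head(ova_predict(models, levels(iris$Species), iris[, -5]))
```

The same loop structure carries over to any binary classifier, including a penalized one.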
Hi @privefl,
Thank you for your answer. I can give it a try.
I was just afraid that I'd make things slower that way.
Can I disable cross-validation? At least in LiblineaR it seems possible, but here I tried setting K = 1 and was unable to run it that way.
Again, thank you for your answer and great package.
António
No, it's not currently possible to disable the crossval.
Thank you.
This is what I've tried (one-vs-all), and it even seems to predict versicolor, where LiblineaR fails, although it takes more time to run.
multi_LR <- function(data, r.train, r.test, preds.test) {
  set.seed(1024)
  X <- as_FBM(as.matrix(data))  # big_spLogReg() expects an FBM built from a numeric matrix
  # One-hot encode the training labels: one 0/1 column per class
  cm <- model.matrix(~ 0 + target, data.frame(target = preds.test))
  colnames(cm) <- levels(factor(preds.test))
  # Fit one binary L1-penalized model per class (one-vs-all)
  out <- lapply(seq_len(ncol(cm)), function(x) {
    y01 <- cm[, x]
    res <- big_spLogReg(X, y01, ind.train = r.train,
                        covar.train = NULL,
                        alphas = 1, warn = FALSE,
                        K = 2, ncores = 3)
    p <- predict(res, X, ind.row = r.test, covar.row = NULL)
    list(prediction = p, model = res)
  })
  names(out) <- colnames(cm)
  return(out)
}
# test
set.seed(1024)
multi.test <- multi_LR(data = iris[, -ncol(iris)],
                       r.train = as.numeric(row.names(train)),
                       r.test = as.numeric(row.names(test)),
                       preds.test = train[, ncol(train)])
preds <- lapply(multi.test, `[[`, "prediction")
comb.preds <- do.call(cbind, preds)
colnames(comb.preds) <- names(multi.test)
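The per-class scores can then be turned into a single predicted label by taking, for each row, the column with the highest probability. A self-contained sketch (the small matrix below is a stand-in with the same shape as `comb.preds`):

```r
# Stand-in for the combined one-vs-all score matrix
# (rows = test samples, columns = classes)
comb.preds <- matrix(c(0.9, 0.1, 0.2,
                       0.2, 0.7, 0.3,
                       0.1, 0.3, 0.8),
                     ncol = 3, byrow = TRUE,
                     dimnames = list(NULL, c("setosa", "versicolor", "virginica")))

# Pick, for each row, the class whose model gave the highest score
final.pred <- colnames(comb.preds)[max.col(comb.preds, ties.method = "first")]
final.pred  # "setosa" "versicolor" "virginica"
```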
Sounds interesting.
What do you expect from me exactly?
I was just wondering if you had some suggestion to make it faster, just in case I was using it wrong.
I guess you can close the issue.
Thank you for your answers.
Is it really slow? What is the size of your data?
And how many models do you have to run? (i.e. how many classes do you have)
I haven't tried big_spLogReg() with my data yet. I was just testing it to see how it compares with LiblineaR, which is what we're using.
We're running multi-class L1-regularized LR (using LiblineaR) for around 200 epochs, with matrices of several thousand rows and columns, e.g., 13,000 x 10,000. For each epoch there are around 15 classes.
I was just trying to find a way to speed this up. I don't know whether using big_spLogReg() instead of LiblineaR would require fewer epochs and lower the overall running time.
Anyway, thank you for your time and interest.
P.S.: when I said it was slow, I was comparing it with LiblineaR on the toy iris data, and that is probably also due to the cross-validation step. I didn't test it on bigger, more realistic data sets, where it may scale better and overtake LiblineaR. I was just trying to get some advice before trying to implement this.
What are you calling "epochs" here exactly?
To make this fast enough, I would parallelize, e.g. the loop over the 15 classes (with foreach). You can also enable parallelization within the function over the K folds.
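A minimal sketch of that class-level parallelization, using base R's `parallel::mclapply()` as a stand-in for foreach, and `glm()` in place of `big_spLogReg()` so the example is self-contained:

```r
library(parallel)  # base R; the advice above suggests foreach, mclapply is a stand-in

# One binary one-vs-all fit per class, run in parallel across classes.
# fit_one_class() uses glm() purely so this sketch is runnable;
# in practice it would call big_spLogReg() (e.g. with K = 4, ncores = 2).
fit_one_class <- function(cl, data, labels) {
  y01 <- as.integer(labels == cl)
  glm(y01 ~ ., data = data, family = binomial)
}

classes <- levels(iris$Species)
# use half the cores for the class loop (mclapply forks, so 1 on Windows)
n_workers <- if (.Platform$OS.type == "windows") 1L else
  max(1L, floor(detectCores() / 2))
models <- mclapply(classes, fit_one_class,
                   data = iris[, -5], labels = iris$Species,
                   mc.cores = n_workers)
names(models) <- classes
```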
Depending on the exact size of your data, I would go for something like K = 4 and ncores = 2, with floor(nb_cores() / 2) processes for the class loop.
By "epoch" I mean each time the LR model is run again. We want to use the LR model's ability to learn and find important properties in the data, but every run uses a slightly different training set.
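If I understand correctly, each epoch could be sketched as drawing a fresh training subset and refitting; this is my assumed interpretation, again with `glm()` standing in for the actual model:

```r
set.seed(42)
n_epochs <- 5  # the workflow described above uses ~200
models <- vector("list", n_epochs)

for (e in seq_len(n_epochs)) {
  # each epoch resamples a slightly different training set
  idx <- sample(nrow(iris), size = floor(0.8 * nrow(iris)))
  y01 <- as.integer(iris$Species[idx] == "setosa")  # one binary task as example
  models[[e]] <- glm(y01 ~ ., data = iris[idx, -5], family = binomial)
}
```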
Thank you for your valuable advice and recommendations.
I'll follow them.
Hi!
I came across this package today and it seems like a great package! Thank you for developing it.
I'm using the LiblineaR package for a multi-class L1-regularized logistic regression task, and I would like to test whether I could use big_spLogReg() to speed it up and be more memory efficient.
Although, if I understood correctly, big_spLogReg() performs L1-regularized logistic regression without being adapted to multi-class classification. Would this be easy to implement/adapt?
I tested it with the toy iris data set and it works really well, but as far as I understood it only works for binary classification, 0 or 1.
Thank you!
Best regards,
António