topepo / caret

caret (Classification And Regression Training): an R package containing miscellaneous functions for training and plotting classification and regression models
http://topepo.github.io/caret/index.html

more questions on findLinearCombos #1296

Open gzagatti opened 2 years ago

gzagatti commented 2 years ago

I was trying to understand the code behind findLinearCombos and have a few questions about its implementation.

I noticed that the first step of the algorithm is:

   lcList <- enumLC(x)

This performs a QR decomposition and then recovers all dependent columns, so lcList contains all the linear combos in x. Each item of the list is a vector of column indices whose first element is the dependent column of the combo:

   tmp <- unlist(lapply(lcList, function(x) x[1]))
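To illustrate the extraction step, here is a small sketch with a hand-built lcList. The values are hypothetical and only mimic the shape described above (first element is the dependent column, the rest are the columns it depends on); they are not output from enumLC.

```r
# Hypothetical lcList: column 3 depends on columns 1 and 2,
# and column 5 depends on columns 1 and 4.
lcList <- list(c(3, 1, 2), c(5, 1, 4))

# Take the first element of each combo, i.e. the dependent column.
tmp <- unlist(lapply(lcList, function(x) x[1]))
print(tmp)
```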

Thus, my understanding is that once the columns in tmp are removed, there should be no linear dependencies left in the data. However, after the first pass of enumLC you still run a while-loop that re-applies enumLC to the data with the tmp columns removed. Would a single pass not be sufficient?
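To make the question concrete, the following is a minimal sketch of the single-pass idea using only base R's qr(). It is not caret's code: dependent_cols is a hypothetical helper standing in for the first elements that enumLC would return, and the data are made up. It removes the columns flagged in the first pass and then checks whether a second pass finds anything.

```r
# Hypothetical helper (not caret's enumLC): return the indices of the columns
# that base R's pivoted QR places beyond the numerical rank.
dependent_cols <- function(x) {
  qr_x <- qr(x)
  if (qr_x$rank == ncol(x)) {
    return(integer(0))
  }
  qr_x$pivot[(qr_x$rank + 1):ncol(x)]
}

# Made-up data: uv and uw are built as linear combinations of u, v and w.
set.seed(1)
u <- rnorm(10)
v <- rnorm(10)
w <- rnorm(10)
x <- cbind(u = u, v = v, uv = u + v, w = w, uw = u - 2 * w)

# First pass: columns flagged as dependent (the analogue of tmp).
first_pass <- dependent_cols(x)

# Drop them with setdiff, which stays safe even when nothing is flagged.
keep <- setdiff(seq_len(ncol(x)), first_pass)
reduced <- x[, keep, drop = FALSE]

# Second pass on the reduced matrix.
second_pass <- dependent_cols(reduced)

print(first_pass)
print(qr(reduced)$rank == ncol(reduced))
print(length(second_pass))
```

If second_pass comes back empty on data like this, the extra iterations do no work in this case. Whether that also holds for ill-conditioned inputs, where the rank decision depends on qr()'s tolerance, is the open part of the question.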

Thanks for developing this amazing package; it makes quick model iteration so much simpler.