mlr-org / mlr

Machine Learning in R
https://mlr.mlr-org.com

List of Possible New Learners #257

Closed kerschke closed 8 years ago

kerschke commented 9 years ago

I just had a look at caret and made a list of learners that have not yet been integrated into mlr (at least as far as I know).

          Package     Function                                      Brief Description
1         adaptDA        amdai                 Adaptive Mixture Discriminant Analysis
2             arm     bayesglm                      Bayesian Generalized Linear Model
3           binda        binda                           Binary Discriminant Analysis
4             bst        bstLs                                   Boosted Linear Model
5             C50         C5.0                                                   C5.0
6           caret          bag                                           Bagged Model
7           caret     bagEarth                                            Bagged MARS
8           caret      pcaNNet                Neural Networks with Feature Extraction
9         caTools   LogitBoost                            Boosted Logistic Regression
10        deepnet          dnn                Stacked AutoEncoder Deep Neural Network
11     elasticnet         enet                                             Elasticnet
12          elmNN          elm                               Extreme Learning Machine
13          enpls        enpls              Ensemble Partial Least Squares Regression
14         evtree       evtree                    Tree Models from Genetic Algorithms
15        fastICA          icr                       Independent Component Regression
16           foba         foba               Ridge Regression with Variable Selection
17           frbs   frbs.learn                               Fuzzy rule-based Systems
18            gam          gam                            Generalized Additive Models
19           gpls         gpls                      Generalized Partial Least Squares
20      HDclassif         hdda                 High Dimensional Discriminant Analysis
21        HiDimDA         Mlda       Maximum Uncertainty Linear Discriminant Analysis
22        HiDimDA        RFlda              Factor-Based Linear Discriminant Analysis
23          ipred         slda                Stabilized Linear Discriminant Analysis
24          ipred    ipredbagg                                            Bagged CART
25           KRLS     krlsPoly            Polynomial Kernel Regularized Least Squares
26           KRLS         krls Radial Basis Function Kernel Regularized Least Squares
27           lars         lars                                 Least Angle Regression
28        logicFS     logicBag                                Bagged Logic Regression
29         mboost     gamboost                     Boosted Generalized Additive Model
30           mgcv          gam               Generalized Additive Model using Splines
31      neuralnet    neuralnet                                         Neural Network
32           nnet       avNNet                          Model Averaged Neural Network
33    nodeHarvest  nodeHarvest                                   Tree-Based Ensembles
34   oblique.tree oblique.tree                                          Oblique Trees
35      obliqueRF    obliqueRF                                  Oblique Random Forest
36           pamr   pamr.train                             Nearest Shrunken Centroids
37        partDSA      partDSA                                                partDSA
38   penalizedLDA PenalizedLDA                 Penalized Linear Discriminant Analysis
39            pls          pls                                  Partial Least Squares
40        plsRglm      plsRglm        Partial Least Squares Generalized Linear Models
41           qrnn         qrnn                     Quantile Regression Neural Network
42 quantregForest          qrf                                 Quantile Random Forest
43   randomForest        parRF                                 Parallel Random Forest
44         relaxo       relaxo                                          Relaxed Lasso
45         rFerns       rFerns                                           Random Ferns
46           rknn         rknn                             Random k-Nearest Neighbors
47       robustDA         rmda                   Robust Mixture Discriminant Analysis
48           rocc      tr.rocc                                   ROC-Based Classifier
49          rrcov       QdaCov                 Robust Quadratic Discriminant Analysis
50        rrcovHD       CSimca                                                  SIMCA
51        rrcovHD       RSimca                                           Robust SIMCA
52            RRF          RRF                              Regularized Random Forest
53            RRF    RRFglobal                              Regularized Random Forest
54          RSNNS          mlp                                 Multi-Layer Perceptron
55          RSNNS          rbf                          Radial Basis Function Network
56          RSNNS       rbfDDA                          Radial Basis Function Network
57          RWeka          LMT                                   Logistic Model Trees
58          RWeka           M5                                             Model Tree
59          RWeka      M5Rules                                            Model Rules
60           SDDA      sddaLDA         Stepwise Diagonal Linear Discriminant Analysis
61           SDDA      sddaQDA      Stepwise Diagonal Quadratic Discriminant Analysis
62      sparseLDA         smda                   Sparse Mixture Discriminant Analysis
63      sparseLDA    sparseLDA                    Sparse Linear Discriminant Analysis
64           spls         spls                           Sparse Partial Least Squares
65          stats          ppr                          Projection Pursuit Regression
66        superpc      superpc                Supervised Principal Component Analysis
67           vbmp   vbmpRadial     Variational Bayesian Multinomial Probit Regression
68           wsrf         wsrf                        Weighted Subspace Random Forest

This list is very likely incomplete and may include some functions that are not useful for our purposes. But at least we now have an overview of possible extensions :-)

rishy commented 9 years ago

I am currently working on implementing mgcv::gam in mlr. After that I will carry on with regularized random forest, relaxed lasso and bagged CART. So other contributors may want to start with other learners.

berndbischl commented 9 years ago

Thx. Please post here if you start working on something, so that effort is not duplicated.

kerschke commented 9 years ago

Working on sparseLDA::sparseLDA.

hetong007 commented 9 years ago

I think I will focus on adding neural network related learners:

I intended to add xgboost first, but soon figured out that it has been added already.

berndbischl commented 8 years ago

@hetong007 What is the status here? Have you added the learners? Do I need to look at the PRs?

hetong007 commented 8 years ago

@berndbischl I have put it on hold because it has some conflicts, and I want to put SwarmSVM in the PR first. I am waiting for SwarmSVM to appear on CRAN and will then open the PR. I could also add this code to the same PR.

berndbischl commented 8 years ago

Why not do it now for the learners that are already on CRAN? That is much easier.

hetong007 commented 8 years ago

Then can I make a new fork and put the code for neuralnet first?

berndbischl commented 8 years ago

sure

berndbischl commented 8 years ago

deepnet would also be nice

hetong007 commented 8 years ago

Actually, I found that deepnet is not very usable. The model quality is really low. I tested deepnet::sae.dnn.train and deepnet::nn.train and had a hard time tuning the parameters, but still got really bad results on the binaryclass data. Even on the demo data set provided in its examples, the model predicts almost the same probability for every data point.

berndbischl commented 8 years ago

Tong, please don't judge the learners at this level. To properly benchmark them, we need them in mlr. Please add the deepnet learner nonetheless.

hetong007 commented 8 years ago

The situation now is that they can't pass my local check in test_learners_classiflabelswitch.R: https://github.com/mlr-org/mlr/blob/master/tests/testthat/test_learners_classiflabelswitch.R#L63

I have them implemented syntactically correctly in my fork of mlr, but the performance just doesn't seem strong enough to pass this test.

larskotthoff commented 8 years ago

Then adjust the error rate threshold in the test.

hetong007 commented 8 years ago

Thanks, now it passes the check.

florianfendt commented 8 years ago

working on rknn::rknn

berndbischl commented 8 years ago

we should move this to the wiki and update it a bit

berndbischl commented 8 years ago

@florianfendt Please do this. Just keep a smaller list of models in the wiki and give people a few helpful pointers on how to integrate them.

berndbischl commented 8 years ago

then close here

florianfendt commented 8 years ago

ok, will do that after we have talked about it on Monday!

berndbischl commented 8 years ago

we have that list in the wiki now. closing