EdwardRaff / JSAT

Java Statistical Analysis Tool, a Java library for Machine Learning
GNU General Public License v3.0

No-Arg Constructors? #13

Open salamanders opened 9 years ago

salamanders commented 9 years ago

I did a bad thing :) and used reflection to try to instantiate, train, and test every classifier you've got, using brain-dead no-arg or simple-arg constructors. With autoAddParameters, because why not.

https://gist.github.com/salamanders/8e7054f62b53eb772895

It exploded all over the place, of course - which is why I'm so interested in as many classifiers as possible having a best-practice default.

On the plus side: when it works, it creates some really fun results that I don't think are nearly as easy to produce with competing libraries!

[Screenshot of example output, 2015-07-16]
EdwardRaff commented 9 years ago

Hah, not bad at all! That's some nifty code, actually!

"Bagging - is there a weak classifier that in general can be assumed to be an 'ok' starting point?"

A Decision Tree is the quintessential "weak" classifier. I would say a tree with a max depth of 6 is a good general purpose weak learner.
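
As a rough sketch of what I mean (the setMaxDepth setter and the AdaBoostM1(weak learner, iterations) constructor are from memory - double-check the signatures against the source):

    import jsat.classifiers.boosting.AdaBoostM1;
    import jsat.classifiers.trees.DecisionTree;

    public class WeakLearnerSketch
    {
        public static AdaBoostM1 defaultBooster()
        {
            // A shallow tree as a general-purpose weak learner.
            // Assumption: DecisionTree exposes setMaxDepth(int) - check the
            // actual setter/constructor in the source.
            DecisionTree weak = new DecisionTree();
            weak.setMaxDepth(6);

            // Assumption: AdaBoostM1 takes (weak learner, max iterations).
            return new AdaBoostM1(weak, 100);
        }
    }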

"Caching & Parallel training - is there an interface possible for caching-enabled trainers?"

Only classes extending the SupportVectorLearner class use a cache.
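
For example, something along these lines should turn the kernel cache on for an SMO-based SVM (a sketch from memory - the CacheMode enum and setCacheMode method names should be verified against the source):

    import jsat.classifiers.svm.PlatSMO;
    import jsat.classifiers.svm.SupportVectorLearner;
    import jsat.distributions.kernels.RBFKernel;

    public class SvmCacheSketch
    {
        public static PlatSMO cachedSmo()
        {
            // Assumption: PlatSMO takes a KernelTrick and inherits
            // setCacheMode(...) from SupportVectorLearner.
            PlatSMO smo = new PlatSMO(new RBFKernel(0.5));
            // Cache all kernel evaluations - trades memory for speed.
            smo.setCacheMode(SupportVectorLearner.CacheMode.FULL);
            return smo;
        }
    }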

"Incompatible data - is there a way to upgrade a one-to-many if the classifier is expecting binary but the data assumes multiple?"

I'm not sure I understand what you are asking here. Mind rephrasing?

EDIT: If you have the time, it would be cool to see a plot of the difference in error rate between default parameters and the results of autoAddParameters w/ RandomSearch on a couple of datasets.
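
Roughly what I have in mind (a sketch from memory - the RandomSearch constructor arguments and the evaluation calls should be checked against the source):

    import java.io.File;
    import jsat.classifiers.ClassificationDataSet;
    import jsat.classifiers.ClassificationModelEvaluation;
    import jsat.classifiers.linear.LogisticRegressionDCD;
    import jsat.io.LIBSVMLoader;
    import jsat.parameters.RandomSearch;

    public class DefaultVsTunedSketch
    {
        public static void main(String[] args) throws Exception
        {
            ClassificationDataSet data = LIBSVMLoader.loadC(new File("diabetes.svm"));

            // Baseline: default parameters, cross-validated error.
            ClassificationModelEvaluation base =
                    new ClassificationModelEvaluation(new LogisticRegressionDCD(), data);
            base.evaluateCrossValidation(10);

            // Tuned: RandomSearch with auto-discovered parameters.
            // Assumption: RandomSearch takes (base classifier, folds); the
            // thread confirms autoAddParameters returns the number added.
            RandomSearch search = new RandomSearch(new LogisticRegressionDCD(), 3);
            int added = search.autoAddParameters(data);
            ClassificationModelEvaluation tuned =
                    new ClassificationModelEvaluation(search, data);
            tuned.evaluateCrossValidation(10);

            System.out.printf("params added: %d, default error: %.3f, tuned error: %.3f%n",
                    added, base.getErrorRate(), tuned.getErrorRate());
        }
    }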

salamanders commented 9 years ago

"I'm not sure I understand what you are asking here. Mind rephrasing?"

Errr... I'm not entirely sure. I got a few that error out with "Network needs categorical attributes to work" or "At least 2 categorical variables are needed for ODE" and was wondering if there is a generic way to wrap the data or the classifier to make them run - like using one-vs-rest to turn a binary classifier into a multi-class classifier.

salamanders commented 9 years ago

re: autoAddParameters: I'm not getting the results I think I should be getting. Why would some be negative?

diabetes.svm:

| Classifier | Time | Original | Improvement |
| --- | ---: | ---: | ---: |
| jsat.classifiers.linear.AROW | 444 | 0.310 | -0.008 |
| jsat.classifiers.svm.extended.OnlineAMM | 573 | 0.257 | 0.001 |
| jsat.classifiers.svm.Pegasos | 452 | 0.327 | 0.066 |
| jsat.classifiers.neuralnetwork.RBFNet | 712 | 0.299 | 0.000 |
| jsat.classifiers.trees.ExtraTree | 279 | 0.289 | 0.000 |
| jsat.classifiers.PriorClassifier | 35 | 0.349 | 0.000 |
| jsat.classifiers.linear.LogisticRegressionDCD | 736 | 0.229 | 0.010 |
| jsat.classifiers.linear.LinearBatch | 2019 | 0.224 | 0.026 |
| jsat.classifiers.MultinomialLogisticRegression | 589 | 0.228 | 0.000 |
| jsat.classifiers.boosting.LogitBoost | 2960 | 0.308 | 0.000 |
| jsat.classifiers.knn.NearestNeighbour | 2203 | 0.259 | -0.001 |
| jsat.classifiers.linear.SCW | 89 | 0.349 | 0.109 |
| jsat.classifiers.svm.PlatSMO | 3852 | 0.229 | -0.026 |
| jsat.classifiers.Rocchio | 18 | 0.267 | 0.000 |
| jsat.classifiers.svm.extended.AMM | 715 | 0.241 | 0.017 |
| jsat.classifiers.linear.NewGLMNET | 591 | 0.227 | 0.008 |
| jsat.classifiers.linear.kernelized.KernelSGD | 7016 | 0.225 | 0.033 |
| jsat.classifiers.linear.StochasticMultinomialLogisticRegression | 333 | 0.258 | 0.000 |
| jsat.classifiers.svm.DCD | 6585 | 0.225 | 0.000 |
| jsat.classifiers.svm.DCDs | 7584 | 0.225 | -0.051 |

salamanders commented 9 years ago

But overall, my original request stands: more classifiers with a sane no-args constructor, please! For example, baking your advice that "a DecisionTree with depth 6 isn't by any means optimal, but it isn't crazy" into all the Boosting constructors.
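
Something like this hypothetical constructor is all I'm asking for (not in JSAT today, and the DecisionTree(int maxDepth) constructor is an assumption on my part):

    // Hypothetical: a sane no-arg constructor for a boosting class that
    // delegates to the shallow-tree-of-depth-6 default discussed above.
    public AdaBoostM1()
    {
        this(new DecisionTree(6), 100);
    }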

EdwardRaff commented 9 years ago

The diabetes dataset is pretty hard and doesn't have a lot of data; you'd probably see bigger differences on larger datasets. MNIST could be a good benchmark.

Some of those, like Rocchio, don't have any "easy" parameters set up yet - so you wouldn't expect to see a change. That's why the method returns the number of parameters added.

Other negatives could mean you are already at the best accuracy for that model, and the different scores are just random chance - so not really meaningful.

I'll reply more when I get home.

EdwardRaff commented 9 years ago

"Errr... I'm not entirely sure. I got a few that error out with 'Network needs categorical attributes to work' or 'At least 2 categorical variables are needed for ODE' and was wondering if there is a generic way to wrap the data or the classifier to make them run - like using one-vs-rest to turn a binary classifier into a multi-class classifier."

I understand your problem now. You could wrap them using the DataModelPipeline and the NominalToNumeric and NumericalToHistogram transforms to get them to work. However, I don't have a programmatic way to determine whether a classifier can work with numeric or categorical features.

Note that NumericalToHistogram is currently a little brain-dead. There are two better ways of doing it that I've been meaning to implement but haven't gotten around to.
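
Something like this is what I mean, writing from memory (the no-arg NumericalToHistogram constructor and the (classifier, transforms...) form of DataModelPipeline may differ in the version you have, so check the source):

    import jsat.classifiers.Classifier;
    import jsat.classifiers.DataModelPipeline;
    import jsat.datatransform.NumericalToHistogram;

    public class PipelineSketch
    {
        // Wrap a classifier that only understands categorical features so it
        // can be fed numeric data: the histogram transform bins the numeric
        // attributes into categories before the model ever sees them.
        public static Classifier wrapCategoricalOnly(Classifier categoricalOnlyModel)
        {
            return new DataModelPipeline(categoricalOnlyModel, new NumericalToHistogram());
        }
    }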

salamanders commented 9 years ago

I'm getting by with the following, and could easily do the same for others if I knew which ones needed it. Should the others implement BinaryScoreClassifier?

    // TODO: Should only wrap if necessary.
    final Classifier model = (model1 instanceof BinaryScoreClassifier
            || model1 instanceof LogitBoost
            || model1 instanceof LogitBoostPL
            || model1 instanceof LogisticRegressionDCD) ? new OneVSAll(model1) : model1;
EdwardRaff commented 9 years ago

Hmm, so some things like LogisticRegression I deliberately didn't have implement BinaryScoreClassifier, because they already produce calibrated probabilities; the interface was introduced so that methods without probabilities could be calibrated. I'm not sure it would make sense to have every binary classifier implement that interface.

I suppose the 3 ways to add such a marker would be adding a new method to the Classifier interface, creating a new "marker" interface with no methods, or creating an annotation. I'll think about it.
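
For the record, roughly what those three options look like (all the names here are hypothetical, nothing that exists in JSAT yet):

    import java.lang.annotation.ElementType;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;

    // Option 1: a new method on the Classifier interface, e.g.
    //     boolean supportsMultiClass();
    // which would touch every existing implementation.

    // Option 2: a marker interface with no methods - cheap to check with instanceof.
    interface MultiClassCapable { }

    // Option 3: an annotation, checked via reflection at runtime.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface BinaryOnly { }

Calling code like your OneVSAll snippet above could then check model1 instanceof MultiClassCapable, or model1.getClass().isAnnotationPresent(BinaryOnly.class), instead of hard-coding class names.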