Just to bug you even more, @rasbt, check out https://github.com/rhiever/tpot/commit/96c69a4fd260e1dfeeed1e360943a86803204cb7 for the implementation of these new classifiers.
Any other parameters that you would include?
Would you implement any parameters differently?
@amueller, do you have any advice on what additional model parameters to open up to evolution?
Nice to see classifiers are coming along!
One thing I've noticed is that you have a ton of repetitive code; it seems like you could abstract the implementation of all these methods into a single generic_regressor/generic_classifier function plus a common structure for storing the regressors and their parameters. Have you tried any generic approach yet?
ANNs are in dev. They don't have dropout yet, but will soon.
How about gradient boosting?
@Chris7: I noticed that today too when I was implementing the new classifiers. I'll look into abstracting the common bits next week, as that would indeed save quite a lot of repeated code.
@amueller: :+1: Looking forward to ANNs in sklearn. I'll add gradient boosting as well. Do you have a sense of what the 2-3 most important parameters are (if there are that many) for each model? I've tinkered around with various parameters for various models, but I don't have a comprehensive view of them like you might.
We want to add a "Default grid" https://github.com/scikit-learn/scikit-learn/pull/5564 but it is somewhat stalled. I'm crazy busy at the moment but I hope to work on that soonish.
Nice. Looks like this will be a good start. Thank you @amueller!
```python
_DEFAULT_PARAM_GRIDS = {
    'AdaBoostClassifier': [{'learning_rate': [0.01, 0.1, 1.0, 10.0, 100.0]}],
    'AdaBoostRegressor': [{'learning_rate': [0.01, 0.1, 1.0, 10.0, 100.0]}],
    'DecisionTreeClassifier': [{'max_features': ["auto", None]}],
    'DecisionTreeRegressor': [{'max_features': ["auto", None]}],
    'ElasticNet': [{'alpha': [0.01, 0.1, 1.0, 10.0, 100.0]}],
    'GradientBoostingClassifier': [{'max_depth': [1, 3, 5]}],
    'GradientBoostingRegressor': [{'max_depth': [1, 3, 5]}],
    'KNeighborsClassifier': [{'n_neighbors': [1, 5, 10, 100],
                              'weights': ['uniform', 'distance']}],
    'KNeighborsRegressor': [{'n_neighbors': [1, 5, 10, 100],
                             'weights': ['uniform', 'distance']}],
    'Lasso': [{'alpha': [0.01, 0.1, 1.0, 10.0, 100.0]}],
    'LinearRegression': [{}],
    'LinearSVC': [{'C': [0.01, 0.1, 1.0, 10.0, 100.0]}],
    'LogisticRegression': [{'C': [0.01, 0.1, 1.0, 10.0, 100.0]}],
    'SVC': [{'C': [0.01, 0.1, 1.0, 10.0, 100.0],
             'gamma': [0.01, 0.1, 1.0, 10.0, 100.0]}],
    'MultinomialNB': [{'alpha': [0.1, 0.25, 0.5, 0.75, 1.0]}],
    'RandomForestClassifier': [{'max_depth': [1, 5, 10, None]}],
    'RandomForestRegressor': [{'max_depth': [1, 5, 10, None]}],
    'Ridge': [{'alpha': [0.01, 0.1, 1.0, 10.0, 100.0]}],
    'SGDClassifier': [{'alpha': [0.000001, 0.00001, 0.0001, 0.001, 0.01],
                       'penalty': ['l1', 'l2', 'elasticnet']}],
    'SGDRegressor': [{'alpha': [0.000001, 0.00001, 0.0001, 0.001, 0.01],
                      'penalty': ['l1', 'l2', 'elasticnet']}],
    'LinearSVR': [{'C': [0.01, 0.1, 1.0, 10.0, 100.0]}],
    'SVR': [{'C': [0.01, 0.1, 1.0, 10.0, 100.0],
             'gamma': [0.01, 0.1, 1.0, 10.0, 100.0]}],
}
```
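To illustrate how a grid like this could be consumed, here is a minimal sketch (the `default_grid_search` wrapper is my own illustration, not part of the linked PR; it assumes the modern `sklearn.model_selection.GridSearchCV` import path):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# A one-entry excerpt of the default grids shown above.
_DEFAULT_PARAM_GRIDS = {
    'LogisticRegression': [{'C': [0.01, 0.1, 1.0, 10.0, 100.0]}],
}


def default_grid_search(estimator, X, y, cv=3):
    """Look up the estimator's default grid by class name and run an
    exhaustive cross-validated search over it."""
    grid = _DEFAULT_PARAM_GRIDS[type(estimator).__name__]
    search = GridSearchCV(estimator, grid, cv=cv)
    search.fit(X, y)
    return search.best_estimator_, search.best_params_
```

Keying the grids on the class name means any estimator instance can pick up a sensible default search space with no extra configuration.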
Feedback welcome. I haven't actually reviewed this, and I'm not sure if someone else has ;)
I'll drop some comments in there.
Going to close this issue and open a new one for expanding classifier parameter search.
Similar to the Decision Tree and Random Forest classifier pipeline operators, also implement:
@rasbt, do you think we should add any more than this? I'd like to add ANNs eventually, but since they're not directly supported in sklearn, that will wait for a later time.