Hi Ryan, there was a bug, which I've just fixed. You can upgrade with:
pip uninstall hep_ml --yes
pip install https://github.com/arogozhnikov/hep_ml/archive/master.zip
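To check which hep_ml installation Python actually sees after the reinstall, something like the following standard-library sketch may help. `installed_version` is a hypothetical helper, not part of hep_ml or REP. Note also that an install from the master archive may report the same version string as the last release, so this only confirms that the package is present.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "hep_ml" is the distribution name used in the pip commands above.
    print("hep_ml version:", installed_version("hep_ml"))
```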
Here is a fixed, working example:
from hep_ml.uboost import uBoostClassifier
from rep.metaml import FoldingScorer, RandomParameterOptimizer, GridOptimalSearchCV
from rep.report.metrics import RocAuc
from rep.estimators import SklearnClassifier
import pandas
import numpy as np
df = pandas.DataFrame(np.random.randn(300, 4),columns=['A', 'B', 'C', 'D'])
df['E'] = 1
df.loc[::2, 'E'] = 0  # set every other label to 0 without chained assignment
labels = df['E']
data = df.drop('E',axis=1)
uni_feats = ['C']
variables = ['A','B','D']
uboost_clf = uBoostClassifier(uniform_features=uni_feats, uniform_label=1, train_features=variables)
grid_param = {}
grid_param['n_estimators'] = [50,100,125,150]
grid_param['n_neighbors'] = [50,51,52,53]
generator = RandomParameterOptimizer(grid_param, n_evaluations=2)
scorer = FoldingScorer(RocAuc(), folds=3, fold_checks=3)
estimator = SklearnClassifier(uboost_clf)
grid_finder = GridOptimalSearchCV(uboost_clf, generator, scorer, parallel_profile='threads-4')
grid_finder.fit(data, labels.values)
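With n_evaluations=2, only two of the 4 x 4 = 16 parameter combinations in grid_param are tried. As a rough illustration of that kind of random sampling, here is a standard-library sketch. It is not REP's implementation, and `sample_param_combinations` and its `seed` argument are hypothetical names introduced only for this example.

```python
import itertools
import random


def sample_param_combinations(grid, n_evaluations, seed=None):
    """Pick up to n_evaluations distinct parameter combinations from a grid at random."""
    names = sorted(grid)
    # Full Cartesian product of the candidate values, one dict per combination.
    all_combos = [
        dict(zip(names, values))
        for values in itertools.product(*(grid[name] for name in names))
    ]
    rng = random.Random(seed)
    # Sample without replacement, capped at the number of available combinations.
    return rng.sample(all_combos, min(n_evaluations, len(all_combos)))


grid_param = {
    'n_estimators': [50, 100, 125, 150],
    'n_neighbors': [50, 51, 52, 53],
}

for params in sample_param_combinations(grid_param, n_evaluations=2, seed=0):
    print(params)
```

Each sampled dict corresponds to one "Parameters n_estimators=..., n_neighbors=..." entry of the kind the grid search reports.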
Hi Alex,
Thanks for taking a look at this so soon. I can confirm it's now working for me too.
Cheers, Ryan
Hi,
I have been trying to add uBoost to my grid search in REP and have run into some difficulties. I have made a minimal example of the error I get:
` df = pandas.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])
df['E'] = 1
df['E'][3:] = 0
labels = df['E']
data = df.drop('E',axis=1)
uni_feats = 'C'
variables = ['A','B','D']
uboost_clf = uBoostClassifier(uniform_features=uni_feats, uniform_label=1,
train_features=variables)
grid_param = {}
grid_param['n_estimators'] = [50,100,125,150]
grid_param['n_neighbors'] = [50,51,52,53]
generator = RandomParameterOptimizer(grid_param,n_evaluations=2)
scorer = FoldingScorer(RocAuc(), folds=3, fold_checks=3)
estimator = SklearnClassifier(uboost_clf)
grid_finder = GridOptimalSearchCV(estimator, generator, scorer, parallel_profile='threads-4')
grid_finder.fit(data, labels)
`
This always results in the error:
Performing grid search in 4 threads
ERROR:rep.metaml.gridsearch:Fail during training on the node
Exception an integer is required
Parameters n_estimators=150, n_neighbors=52
ERROR:rep.metaml.gridsearch:Fail during training on the node
Exception an integer is required
Parameters n_estimators=125, n_neighbors=52
2 evaluations done
I have had a look but had no luck finding the source of the exception, and I'm a bit puzzled as to what is causing it; the same code works for a number of other classifiers.
Is this just a case of something that is not supported by uBoost?
Any help or clarification here would be greatly appreciated, Ryan