openedx-unsupported / ease

EASE (Enhanced AI Scoring Engine) is a library that allows for machine learning based classification of textual content. This is useful for tasks such as scoring student essays.
GNU Affero General Public License v3.0

get_confidence_value() => AttributeError: 'GradientBoostingRegressor' object has no attribute 'predict_proba' #31

Closed kern3020 closed 11 years ago

kern3020 commented 11 years ago

Hello,

I'm looking into why the confidence is zero. I noticed this in the log file.

[2013-06-04 13:08:49,524: ERROR/MainProcess] Problem generating confidence value
Traceback (most recent call last):
  File "/opt/edx/local/lib/python2.7/site-packages/ease-0.1-py2.7.egg/ease/grade.py", line 70, in grade
    results['confidence'] = get_confidence_value(grader_data['algorithm'], grader_data['model'], grader_feats, results['score'], grader_data['score'])
  File "/opt/edx/local/lib/python2.7/site-packages/ease-0.1-py2.7.egg/ease/grade.py", line 166, in get_confidence_value
    raw_confidence=model.predict_proba(grader_feats)[0,(float(score)-float(min_score))]
AttributeError: 'GradientBoostingRegressor' object has no attribute 'predict_proba'

When I inspected the GradientBoostingRegressor class, I wasn't able to find the 'predict_proba' attribute.

>>> sklearn.__version__
'0.12.1'
>>> type (clf)
<class 'sklearn.ensemble.gradient_boosting.GradientBoostingRegressor'>
>>> dir(clf)
['__abstractmethods__', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__getitem__', '__hash__', '__init__', '__len__', '__metaclass__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_abc_cache', '_abc_negative_cache', '_abc_negative_cache_version', '_abc_registry', '_get_param_names', '_init_decision_function', '_make_estimator', 'alpha', 'decision_function', 'estimators_', 'feature_importances_', 'fit', 'fit_stage', 'get_params', 'init', 'learn_rate', 'loss', 'loss_', 'max_depth', 'max_features', 'min_samples_leaf', 'min_samples_split', 'n_classes_', 'n_estimators', 'n_features', 'oob_score_', 'predict', 'random_state', 'score', 'set_params', 'staged_decision_function', 'staged_predict', 'subsample', 'train_score_']
>>> 

-jk

VikParuchuri commented 11 years ago

Yeah, it will use a classification or a regression algorithm based on how many score points there are. Classifiers deal with probabilities; regressors don't. Will have to fix this to call the appropriate method depending on the instance type.
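For illustration, here is a minimal sketch of what dispatching on the model's capabilities could look like. It is not the actual fix that went into ease. The simplified signature, the stub classes, and in particular the regressor fallback (confidence taken from how close the raw prediction lies to a whole score point) are assumptions made only for this example. It also casts the column index to an int, since the float index in the traceback above would not be a valid index on recent NumPy versions.

```python
def get_confidence_value(model, grader_feats, score, min_score):
    """Return a confidence in [0, 1] for a predicted score (illustrative only)."""
    if hasattr(model, "predict_proba"):
        # Classifier: predict_proba yields one probability column per score point,
        # so pick the column that corresponds to the predicted score.
        column = int(float(score) - float(min_score))
        return float(model.predict_proba(grader_feats)[0][column])

    # Regressor: there are no class probabilities to read.
    # ASSUMPTION: use the distance of the raw prediction from the nearest
    # whole score as a stand-in for confidence (1.0 on a whole score,
    # 0.0 exactly halfway between two scores).
    raw = float(model.predict(grader_feats)[0])
    return 1.0 - 2.0 * abs(raw - round(raw))


# Hypothetical stand-ins for fitted scikit-learn models, so the sketch runs
# without scikit-learn installed.
class StubClassifier:
    def predict_proba(self, feats):
        return [[0.1, 0.7, 0.2] for _ in feats]


class StubRegressor:
    def predict(self, feats):
        return [2.25 for _ in feats]


classifier_confidence = get_confidence_value(StubClassifier(), [[0.0]], score=2, min_score=1)
regressor_confidence = get_confidence_value(StubRegressor(), [[0.0]], score=2, min_score=1)
```

Checking with hasattr rather than isinstance keeps the function independent of any particular estimator class. Whether a distance-based fallback is a meaningful confidence for a given grader is a separate question that this sketch does not answer.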


VikParuchuri commented 11 years ago

Fixed.