jma127 / pyltr

Python learning to rank (LTR) toolkit
BSD 3-Clause "New" or "Revised" License

How to use the GridSearchCV module in sklearn to tune parameters? #13

Closed: jxfruit closed this issue 5 years ago

jxfruit commented 6 years ago

Hi, thanks for your work, which is very significant. As the title says, how can I use the GridSearchCV module in sklearn to tune parameters automatically? I tried a few approaches, for example inheriting from the BaseEstimator and RegressorMixin classes in module _modle.py, but I ran into a problem. Here is the error output:

  File "E:/Python projects/others/lambdamart_t.py", line 105, in training_model
    gscv.fit(training_set[1], training_set[2], groups=training_set[0])
  File "E:\Python35\lib\site-packages\sklearn\model_selection\_search.py", line 638, in fit
    cv.split(X, y, groups)))
  File "E:\Python35\lib\site-packages\sklearn\externals\joblib\parallel.py", line 779, in __call__
    while self.dispatch_one_batch(iterator):
  File "E:\Python35\lib\site-packages\sklearn\externals\joblib\parallel.py", line 625, in dispatch_one_batch
    self._dispatch(tasks)
  File "E:\Python35\lib\site-packages\sklearn\externals\joblib\parallel.py", line 588, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "E:\Python35\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 111, in apply_async
    result = ImmediateResult(func)
  File "E:\Python35\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 332, in __init__
    self.results = batch()
  File "E:\Python35\lib\site-packages\sklearn\externals\joblib\parallel.py", line 131, in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "E:\Python35\lib\site-packages\sklearn\externals\joblib\parallel.py", line 131, in <listcomp>
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "E:\Python35\lib\site-packages\sklearn\model_selection\_validation.py", line 437, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
TypeError: fit() missing 1 required positional argument: 'qids'

I also implemented the get_params() and set_params() methods in module lambdamart.py, but the same error occurs. The implementation is as follows:

def get_params(self, deep=True):
    return {'metric': self.metric,
            'learning_rate': self.learning_rate,
            'n_estimators': self.n_estimators,
            'query_subsample': self.query_subsample,
            'subsample': self.subsample,
            'min_samples_split': self.min_samples_split,
            'min_samples_leaf': self.min_samples_leaf,
            'max_depth': self.max_depth,
            'random_state': self.random_state,
            'max_features': self.max_features,
            'verbose': self.verbose,
            'max_leaf_nodes': self.max_leaf_nodes,
            'warm_start': self.warm_start}

def set_params(self, **params):
    """Sets the parameters of this estimator.

    # Arguments
        **params: Dictionary of parameter names mapped to their values.
    # Returns
        self
    """
    for parameter, value in params.items():
        setattr(self, parameter, value)
    return self

My call is:

gscv = GridSearchCV(pyltr.models.LambdaMART(), params_lst, scoring=pyltr.metrics.AUCROC.calc_mean, n_jobs=1, cv=5, verbose=1)
gscv.fit(training_set[1], training_set[2], groups=training_set[0])

training_set is a 3-tuple of training_qids, training_data and training_labels, in that order, all of which are arrays. Looking forward to your reply, thanks very much!

jma127 commented 5 years ago

It seems that the extra qids parameter is not supported by this grid search class.
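The traceback shows why: the grid search ends up calling `estimator.fit(X_train, y_train, **fit_params)`, and `groups` is only handed to `cv.split`, so `qids` never reaches `fit()`. One possible workaround is to write the search loop by hand. What follows is only a minimal sketch using the standard library, not something from pyltr or scikit-learn: `param_grid`, `group_folds`, `grid_search`, `ToyRanker` and `neg_mse` are all hypothetical names. It assumes a model exposing `fit(X, y, qids)` and `predict(X)`, and a scoring function called as `score_fn(qids, y_true, y_pred)`, an argument order assumed here to match pyltr's `calc_mean`.

```python
from itertools import product
from statistics import mean


def param_grid(grid):
    """Yield every combination from a dict of {name: [candidate values]}."""
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def group_folds(qids, n_splits):
    """Yield (train_idx, test_idx) pairs that never split a query across folds.

    Needs at least n_splits distinct query ids, otherwise a fold is empty.
    """
    unique = list(dict.fromkeys(qids))  # distinct query ids, first-seen order
    for k in range(n_splits):
        held_out = set(unique[k::n_splits])
        train = [i for i, q in enumerate(qids) if q not in held_out]
        test = [i for i, q in enumerate(qids) if q in held_out]
        yield train, test


def grid_search(make_model, grid, X, y, qids, score_fn, n_splits=5):
    """Return (best_params, best_score) by mean cross-validated score.

    make_model(**params) must return an object with fit(X, y, qids) and
    predict(X). score_fn(qids, y_true, y_pred) returns a float, higher is better.
    """
    best_params, best_score = None, float("-inf")
    for params in param_grid(grid):
        scores = []
        for train, test in group_folds(qids, n_splits):
            model = make_model(**params)
            # qids are passed to fit() explicitly, which GridSearchCV did not do.
            model.fit([X[i] for i in train],
                      [y[i] for i in train],
                      [qids[i] for i in train])
            pred = model.predict([X[i] for i in test])
            scores.append(score_fn([qids[i] for i in test],
                                   [y[i] for i in test],
                                   pred))
        avg = mean(scores)
        if avg > best_score:
            best_params, best_score = params, avg
    return best_params, best_score


class ToyRanker:
    """Hypothetical stand-in for a ranking model whose fit() requires qids."""

    def __init__(self, scale=1):
        self.scale = scale

    def fit(self, X, y, qids):
        return self  # nothing to learn in this toy

    def predict(self, X):
        return [self.scale * row[0] for row in X]


def neg_mse(qids, y_true, y_pred):
    """Toy score (higher is better); qids is accepted but unused here."""
    return -mean((a - b) ** 2 for a, b in zip(y_true, y_pred))


# Toy data: four queries with two documents each, labels equal to 2 * feature.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0], [8.0]]
qids = ["a", "a", "b", "b", "c", "c", "d", "d"]
y = [2.0 * row[0] for row in X]

best_params, best_score = grid_search(
    ToyRanker, {"scale": [1, 2, 3]}, X, y, qids, neg_mse, n_splits=2)
print(best_params, best_score)
```

With NumPy arrays the list comprehensions would become fancy indexing such as `X[train]`, and scikit-learn's `ParameterGrid` and `GroupKFold` could stand in for the two helper generators. Whether this fits a real pyltr model and metric unchanged is an assumption to check against the installed version.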