RUCAIBox / RecBole

A unified, comprehensive and efficient recommendation library
https://recbole.io/
MIT License
3.37k stars 606 forks

[🐛BUG] XGBoost Evaluation Error #1998

pintonos closed 7 months ago

pintonos commented 7 months ago

Hi, running XGBoost or LightGBM raises an IndexError, as shown in the error output below. Training itself seems to run fine, but the evaluation after training fails.

Any ideas? Thanks for the help!

[0] train-auc:0.63755   train-logloss:0.29392   valid-auc:0.71442   valid-logloss:0.30886
[50]    train-auc:0.74778   train-logloss:0.24132   valid-auc:0.74778   valid-logloss:0.27760
[99]    train-auc:0.78038   train-logloss:0.23128   valid-auc:0.75795   valid-logloss:0.27920
Traceback (most recent call last):
  File "C:/Program Files/JetBrains/PyCharm 2023.2.4/plugins/python/helpers/pydev/pydevd.py", line 1534, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2023.2.4\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:\Users\Andreas\Documents\femble\recommender-research\recommender.py", line 17, in <module>
    run_recbole(model='XGBoost', dataset='cycle-label', config_file_list=['configs/general.yaml', 'configs/dataset_label.yaml', 'configs/xgboost.yaml'])
  File "C:\Users\Andreas\.virtualenvs\seq-rec-jKWz98TR\lib\site-packages\recbole\quick_start\quick_start.py", line 148, in run_recbole
    best_valid_score, best_valid_result = trainer.fit(
  File "C:\Users\Andreas\.virtualenvs\seq-rec-jKWz98TR\lib\site-packages\recbole\trainer\trainer.py", line 1010, in fit
    valid_score, valid_result = self._valid_epoch(valid_data)
  File "C:\Users\Andreas\.virtualenvs\seq-rec-jKWz98TR\lib\site-packages\recbole\trainer\trainer.py", line 980, in _valid_epoch
    valid_result = self.evaluate(valid_data, load_best_model=False)
  File "C:\Users\Andreas\.virtualenvs\seq-rec-jKWz98TR\lib\site-packages\recbole\trainer\trainer.py", line 1144, in evaluate
    result = self.evaluator.evaluate(self.eval_collector.get_data_struct())
  File "C:\Users\Andreas\.virtualenvs\seq-rec-jKWz98TR\lib\site-packages\recbole\evaluator\evaluator.py", line 39, in evaluate
    metric_val = self.metric_class[metric].calculate_metric(dataobject)
  File "C:\Users\Andreas\.virtualenvs\seq-rec-jKWz98TR\lib\site-packages\recbole\evaluator\metrics.py", line 154, in calculate_metric
    pos_index, pos_len = self.used_info(dataobject)
  File "C:\Users\Andreas\.virtualenvs\seq-rec-jKWz98TR\lib\site-packages\recbole\evaluator\base_metric.py", line 63, in used_info
    rec_mat = dataobject.get("rec.topk")
  File "C:\Users\Andreas\.virtualenvs\seq-rec-jKWz98TR\lib\site-packages\recbole\evaluator\collector.py", line 38, in get
    raise IndexError("Can not load the data without registration !")
IndexError: Can not load the data without registration !

zhengbw0324 commented 7 months ago

@pintonos My guess is that you are using a ranking metric such as NDCG. XGBoost currently supports only AUC, LogLoss, and other metrics that do not depend on rankings.
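
For anyone hitting the same traceback, here is a minimal sketch of a pre-flight check along those lines. The helper `find_ranking_metrics` and the two metric groupings are illustrative assumptions, not part of RecBole, so verify the names against the metric list of your installed version. The `config_dict` only overrides `metrics` and `valid_metric`; everything else is assumed to come from the existing YAML files.

```python
# Assumption: these groupings mirror RecBole's metric names for top-k ranking
# metrics and value-based metrics; check them against your installed version.
TOPK_METRICS = {"recall", "mrr", "ndcg", "hit", "precision", "map"}
VALUE_METRICS = {"auc", "logloss", "mae", "rmse"}


def find_ranking_metrics(metrics):
    """Return the configured metrics that need top-k ranking results.

    Hypothetical helper: it only compares names, it does not inspect RecBole.
    """
    return [m for m in metrics if m.lower() in TOPK_METRICS]


# Override that keeps only value-based metrics for the XGBoost run.
config_dict = {
    "metrics": ["AUC", "LogLoss"],
    "valid_metric": "AUC",
}

offending = find_ranking_metrics(config_dict["metrics"])
if offending:
    raise ValueError(
        f"Ranking metrics {offending} are not supported here; "
        f"use value-based metrics such as {sorted(VALUE_METRICS)}"
    )
```

The override can then be passed next to the existing config files, for example `run_recbole(model='XGBoost', dataset='cycle-label', config_file_list=[...], config_dict=config_dict)`, or the same two keys can be set directly in `configs/xgboost.yaml`.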

pintonos commented 7 months ago

Yes, this was the issue, thanks for the help!