cvangysel / pytrec_eval

pytrec_eval is an Information Retrieval evaluation tool for Python, based on the popular trec_eval.
http://ilps.science.uva.nl/
MIT License

TypeError: Expected object_relevance_per_qid dictionary and measures set. #52

Open guenthermi opened 3 months ago

guenthermi commented 3 months ago

I am getting this error when running an evaluation task with MTEB.

It seems to be related to pytrec_eval: the error is raised in the shared library (pytrec_eval_ext.cpython-310-x86_64-linux-gnu.so).

Interestingly, copying over the shared library from a different machine with a similar configuration and the same version of pytrec_eval installed seems to solve the issue. Nevertheless, it would be good to find out what is going wrong.
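Since swapping the shared library changes the behaviour, one way to confirm that the two machines really hold different builds is to compare the file each interpreter would load. The following is a minimal sketch, assuming the extension is importable under the top-level name `pytrec_eval_ext` (as the file name above suggests); `describe_module_file` is a hypothetical helper, not part of pytrec_eval.

```python
import hashlib
import importlib.util
import os


def describe_module_file(module_name):
    """Return path, size and SHA-256 of the file backing a module, or None.

    Hypothetical helper for comparing builds across machines; it only
    locates the file and never executes the module itself.
    """
    spec = importlib.util.find_spec(module_name)
    # Built-in or missing modules have no file on disk to inspect.
    if spec is None or not spec.origin or not os.path.isfile(spec.origin):
        return None
    with open(spec.origin, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()
    return {
        "path": spec.origin,
        "size": os.path.getsize(spec.origin),
        "sha256": digest,
    }


if __name__ == "__main__":
    # Assumption: the compiled extension is a top-level module with this name.
    print(describe_module_file("pytrec_eval_ext"))
```

Running this on both machines and comparing the `sha256` values would show whether the two `.so` files differ even though the reported package version is the same.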

Here is the trace:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/mguenther/finetuner-large-scale-training/run_mteb.py", line 42, in <module>
    evaluation.run(
  File "/home/mguenther/finetuner-large-scale-training/venv/lib/python3.10/site-packages/mteb/evaluation/MTEB.py", line 398, in run
    raise e
  File "/home/mguenther/finetuner-large-scale-training/venv/lib/python3.10/site-packages/mteb/evaluation/MTEB.py", line 364, in run
    results, tick, tock = self._run_eval(
  File "/home/mguenther/finetuner-large-scale-training/venv/lib/python3.10/site-packages/mteb/evaluation/MTEB.py", line 260, in _run_eval
    results = task.evaluate(model, split, output_folder=output_folder, **kwargs)
  File "/home/mguenther/finetuner-large-scale-training/venv/lib/python3.10/site-packages/mteb/abstasks/AbsTaskRetrieval.py", line 273, in evaluate
    scores[hf_subset] = self._evaluate_subset(
  File "/home/mguenther/finetuner-large-scale-training/venv/lib/python3.10/site-packages/mteb/abstasks/AbsTaskRetrieval.py", line 311, in _evaluate_subset
    ndcg, _map, recall, precision = retriever.evaluate(
  File "/home/mguenther/finetuner-large-scale-training/venv/lib/python3.10/site-packages/mteb/evaluation/evaluators/RetrievalEvaluator.py", line 538, in evaluate
    evaluator = pytrec_eval.RelevanceEvaluator(
  File "/home/mguenther/finetuner-large-scale-training/venv/lib/python3.10/site-packages/pytrec_eval/__init__.py", line 59, in __init__
    super().__init__(query_relevance=query_relevance, measures=measures, relevance_level=relevance_level, judged_docs_only_flag=judged_docs_only_flag)
TypeError: Expected object_relevance_per_qid dictionary and measures set.

The RelevanceEvaluator is initialized in the following way:

evaluator = pytrec_eval.RelevanceEvaluator(
    qrels, {map_string, ndcg_string, recall_string, precision_string}
)

with qrels being:

defaultdict(<class 'dict'>, {'1': {'31715818': 1}, '3': {'14717500': 1}, '5': {'13734012': 1}, '13': {'1606628': 1}, '36': {'5152028': 1, '11705328': 1}, '42': {'18174210': 1}, '48': {'13734012': 1}, '49': {'5953485': 1}, '50': {'12580014': 1}, '51': {'45638119': 1}, '53': {'45638119': 1}, '54': {'49556906': 1}, '56': {'4709641': 1}, '57': {'4709641': 1}, '70': {'5956380': 1, '4414547': 1}, '72': {'6076903': 1}, '75': {'4387784': 1}, '94': {'1215116': 1}, '99': {'18810195': 1}, '100': {'4381486': 1}, '113': {'6157837': 1}, '115': {'33872649': 1}, '118': {'6372244': 1}, '124': {'4883040': 1}, '127': {'21598000': 1}, '128': {'8290953': 1}, '129': {'27768226': 1}, '130': {'27768226': 1}, '132': {'7975937': 1}, '133': {'38485364': 1, '6969753': 1, '17934082': 1, '16280642': 1, '12640810': 1}, '137': {'26016929': 1}, '141': {'6955746': 1, '14437255': 1}, '142': {'10582939': 1}, '143': {'10582939': 1}, '146': {'10582939': 1}, '148': {'1084345': 1}, '163': {'18872233': 1}, '171': {'12670680': 1}, '179': {'16322674': 1, '27123743': 1, '23557241': 1, '17450673': 1}, '180': {'16966326': 1}, '183': {'12827098': 1}, '185': {'18340282': 1}, '198': {'2177022': 1}, '208': {'13519661': 1}, '212': {'22038539': 1}, '213': {'13625993': 1}, '216': {'21366394': 1}, '217': {'21366394': 1}, '218': {'21366394': 1}, '219': {'21366394': 1}, '230': {'3067015': 1}, '232': {'10536636': 1}, '233': {'4388470': 1}, '236': {'4388470': 1}, '237': {'4942718': 1}, '238': {'2251426': 1}, '239': {'14079881': 1}, '248': {'1568684': 1}, '249': {'1568684': 1}, '261': {'1122279': 1, '10697096': 1}, '268': {'970012': 1}, '269': {'970012': 1}, '274': {'11614737': 1}, '275': {'4961038': 1, '14241418': 1, '14819804': 1}, '279': {'14376683': 1}, '294': {'10874408': 1}, '295': {'20310709': 1}, '298': {'39381118': 1}, '300': {'3553087': 1}, '303': {'4388470': 1}, '312': {'6173523': 1}, '314': {'4347374': 1}, '324': {'2014909': 1}, '327': {'17997584': 1}, '338': {'23349986': 1}, '343': {'7873737': 1, '5884524': 
1}, '350': {'16927286': 1}, '354': {'8774475': 1}, '362': {'38587347': 1}, '380': {'19005293': 1}, '384': {'13770184': 1}, '385': {'9955779': 1, '9767444': 1}, '386': {'16495649': 1}, '388': {'1148122': 1}, '399': {'791050': 1}, '410': {'14924526': 1}, '411': {'14924526': 1}, '415': {'6309659': 1}, '421': {'11172205': 1}, '431': {'28937856': 1}, '436': {'14637235': 1}, '437': {'18399038': 1}, '439': {'4423559': 1}, '440': {'4423559': 1}, '443': {'10165258': 1}, '452': {'12804937': 1, '464511': 1}, '475': {'18678095': 1}, '478': {'14767844': 1}, '491': {'56893404': 1}, '501': {'17930286': 1}, '502': {'13071728': 1}, '507': {'30774694': 1}, '508': {'13980338': 1}, '513': {'13230773': 1}, '514': {'16256507': 1}, '516': {'29564505': 1}, '517': {'15663829': 1}, '521': {'34873974': 1}, '525': {'13639330': 1}, '527': {'3863543': 1}, '528': {'5476778': 1}, '532': {'12991445': 1}, '533': {'12991445': 1}, '535': {'39368721': 1}, '536': {'16056514': 1}, '539': {'13282296': 1}, '540': {'11886686': 1, '25007443': 1}, '544': {'24221369': 1}, '549': {'9433958': 1}, '551': {'33499189': 1}, '552': {'1471041': 1}, '554': {'1049501': 1}, '560': {'40096222': 1}, '569': {'23460562': 1}, '575': {'10300888': 1}, '577': {'5289038': 1}, '578': {'8764879': 1}, '587': {'16999023': 1}, '589': {'10984005': 1}, '593': {'19675911': 1}, '597': {'12779444': 1, '36355784': 1, '25742130': 1}, '598': {'25742130': 1}, '613': {'9638032': 1}, '619': {'20888849': 1, '2565138': 1}, '623': {'17000834': 1}, '628': {'24512064': 1}, '636': {'24294572': 1}, '637': {'25649714': 1}, '641': {'5912283': 1, '31554917': 1}, '644': {'13619127': 1}, '649': {'12789595': 1}, '659': {'1215116': 1}, '660': {'1215116': 1}, '674': {'2095573': 1}, '684': {'4942718': 1}, '690': {'18750453': 1}, '691': {'10991183': 1}, '692': {'24088502': 1}, '693': {'24088502': 1}, '700': {'4350400': 1}, '702': {'4350400': 1}, '715': {'18421962': 1}, '716': {'18421962': 1}, '718': {'17587795': 1}, '721': {'1834762': 1}, '723': {'5531479': 1}, 
'727': {'7521113': 1}, '728': {'7521113': 1, '36444198': 1}, '729': {'26851674': 1}, '742': {'32159283': 1}, '743': {'32159283': 1}, '744': {'8460275': 1}, '756': {'2831620': 1}, '759': {'1805641': 1}, '768': {'6421792': 1}, '770': {'15476777': 1}, '775': {'32275758': 1}, '781': {'24338780': 1}, '783': {'40632104': 1}, '784': {'2356950': 1}, '785': {'12471115': 1}, '793': {'8551160': 1}, '800': {'22543403': 1}, '805': {'22180793': 1}, '808': {'36606083': 1}, '811': {'19799455': 1}, '814': {'33387953': 1}, '820': {'8646760': 1}, '821': {'8646760': 1}, '823': {'15319019': 1}, '830': {'1897324': 1}, '831': {'1897324': 1}, '832': {'30303335': 1}, '834': {'5483793': 1}, '837': {'15928989': 1}, '839': {'1469751': 1}, '845': {'17741440': 1}, '847': {'16787954': 1}, '852': {'13843341': 1}, '859': {'1982286': 1}, '870': {'195689316': 1}, '873': {'1180972': 1, '19307912': 1, '27393799': 1, '29025270': 1, '3315558': 1}, '879': {'8426046': 1}, '880': {'8426046': 1}, '882': {'14803797': 1}, '887': {'18855191': 1}, '903': {'10648422': 1}, '904': {'7370282': 1}, '907': {'6923961': 1}, '911': {'11254556': 1}, '913': {'3203590': 1}, '914': {'3203590': 1}, '921': {'1642727': 1}, '922': {'17077004': 1}, '936': {'5483793': 1}, '956': {'12956194': 1}, '957': {'123859': 1}, '960': {'8780599': 1}, '967': {'2119889': 1, '8997410': 1}, '971': {'46695481': 1, '27873158': 1, '28617573': 1, '9764256': 1}, '975': {'5304891': 1}, '982': {'2988714': 1}, '985': {'6828370': 1}, '993': {'16472469': 1}, '1012': {'9745001': 1}, '1014': {'6277638': 1}, '1019': {'11603066': 1}, '1020': {'9433958': 1}, '1021': {'9433958': 1}, '1024': {'5373138': 1}, '1029': {'13923140': 1, '13940200': 1, '11899391': 1}, '1041': {'25254425': 1, '16626264': 1}, '1049': {'12486491': 1}, '1062': {'20381484': 1}, '1086': {'39281140': 1}, '1088': {'37549932': 1}, '1089': {'17628888': 1}, '1099': {'7662206': 1}, '1100': {'7662206': 1}, '1104': {'3898784': 1}, '1107': {'20532591': 1}, '1110': {'13770184': 1}, '1121': 
{'4456756': 1}, '1130': {'17997584': 1}, '1132': {'33499189': 1, '9283422': 1}, '1137': {'33370': 1}, '1140': {'12009265': 1}, '1144': {'10071552': 1}, '1146': {'13906581': 1}, '1150': {'11369420': 1}, '1163': {'15305881': 1}, '1175': {'31272411': 1}, '1179': {'31272411': 1}, '1180': {'31272411': 1}, '1185': {'16737210': 1}, '1187': {'52873726': 1}, '1191': {'30655442': 1}, '1194': {'11419230': 1}, '1196': {'25649714': 1}, '1197': {'25649714': 1}, '1199': {'16760369': 1}, '1200': {'3441524': 1}, '1202': {'3475317': 1}, '1204': {'31141365': 1}, '1207': {'18909530': 1}, '1213': {'14407673': 1}, '1216': {'24142891': 1}, '1221': {'19736671': 1}, '1225': {'9650982': 1}, '1226': {'13777138': 1}, '1232': {'13905670': 1}, '1241': {'4427392': 1}, '1245': {'7662395': 1}, '1259': {'24341590': 1}, '1262': {'44172171': 1}, '1266': {'37480103': 1}, '1270': {'13900610': 1}, '1271': {'13768432': 1}, '1272': {'17081238': 1}, '1273': {'11041152': 1}, '1274': {'12428814': 1, '27731651': 1, '4406819': 1}, '1278': {'11335781': 1}, '1279': {'11335781': 1}, '1280': {'4387784': 1}, '1281': {'4387784': 1}, '1282': {'23649163': 1}, '1290': {'4687948': 1}, '1292': {'56893404': 1}, '1298': {'11718220': 1}, '1303': {'12631697': 1}, '1316': {'27910499': 1}, '1319': {'16284655': 1}, '1320': {'16284655': 1}, '1332': {'5304891': 1}, '1335': {'27910499': 1}, '1336': {'27910499': 1}, '1337': {'20231138': 1}, '1339': {'15482274': 1}, '1344': {'9559146': 1}, '1352': {'12885341': 1}, '1359': {'11614737': 1}, '1362': {'8290953': 1}, '1363': {'8290953': 1}, '1368': {'2425364': 1}, '1370': {'2425364': 1}, '1379': {'16322674': 1, '27123743': 1, '23557241': 1, '17450673': 1}, '1382': {'17755060': 1}, '1385': {'306006': 1}, '1389': {'23895668': 1}, '1395': {'17717391': 1}})

and {map_string, ndcg_string, recall_string, precision_string}:

{'ndcg_cut.1,3,5,10,20,100,1000', 'recall.1,3,5,10,20,100,1000', 'map_cut.1,3,5,10,20,100,1000', 'P.1,3,5,10,20,100,1000'}

Both machines have pytrec_eval version 0.5, setuptools 59.6.0, Python 3.10.12, Ubuntu 22.04, and gcc 11.4.0.
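To rule out a problem with the shape of the inputs themselves, a pure-Python check along the following lines may help. It is only a sketch based on the wording of the error message (a dictionary of per-query relevance dictionaries plus a set of measures); `check_evaluator_inputs` is a hypothetical helper and does not replicate the checks done inside the compiled extension.

```python
from collections import defaultdict


def check_evaluator_inputs(qrels, measures):
    """Return a list of human-readable problems; empty if none were found.

    Assumed expectations (inferred from the error message, not from the
    extension's source): qrels maps str query ids to dicts of
    str document id -> int relevance, and measures is a set of str.
    """
    problems = []
    if not isinstance(qrels, dict):
        problems.append(f"qrels is {type(qrels).__name__}, expected dict")
    else:
        for qid, docs in qrels.items():
            if not isinstance(qid, str):
                problems.append(f"query id {qid!r} is not a str")
            if not isinstance(docs, dict):
                problems.append(f"qrels[{qid!r}] is not a dict")
                continue
            for doc_id, rel in docs.items():
                if not isinstance(doc_id, str):
                    problems.append(f"doc id {doc_id!r} is not a str")
                # bool is a subclass of int, so exclude it explicitly.
                if isinstance(rel, bool) or not isinstance(rel, int):
                    problems.append(f"relevance {rel!r} is not an int")
    if not isinstance(measures, set):
        problems.append(f"measures is {type(measures).__name__}, expected set")
    elif not all(isinstance(m, str) for m in measures):
        problems.append("measures contains non-str entries")
    return problems


# A small excerpt of the inputs shown above.
qrels = defaultdict(dict)
qrels["1"]["31715818"] = 1
qrels["36"]["5152028"] = 1
measures = {"ndcg_cut.1,3,5,10,20,100,1000", "P.1,3,5,10,20,100,1000"}
print(check_evaluator_inputs(qrels, measures))
```

If a check like this reports nothing for the real inputs, the inputs are unlikely to be the cause, which points back at the installed extension.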

seanmacavaney commented 3 months ago

I'm not able to reproduce:

>>> import pytrec_eval
>>> from collections import defaultdict
>>> q = defaultdict(dict)
>>> q['test']['1'] = 1
>>> measures = {'ndcg_cut.1,3,5,10,20,100,1000', 'recall.1,3,5,10,20,100,1000', 'map_cut.1,3,5,10,20,100,1000', 'P.1,3,5,10,20,100,1000'}
>>> pytrec_eval.RelevanceEvaluator(q, measures)
<pytrec_eval.RelevanceEvaluator object at 0x7f9095f3bac0>

Can you try to make a minimal reproducible example?

guenthermi commented 2 months ago

Sorry for the late reply. I ran your code and got the same error again. However, when reinstalling pytrec_eval I noticed that wheel was not installed, so the installer reported:

Using legacy 'setup.py install' for pytrec_eval, since package 'wheel' is not installed.
Installing collected packages: pytrec_eval
  Running setup.py install for pytrec_eval ...

After installing wheel and installing pytrec_eval again, the code works as expected.
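For anyone hitting the same error, the fix described above can be sketched as a short script. This is an illustration under assumptions: `wheel_is_installed` and `reinstall_commands` are hypothetical helpers, and the `--force-reinstall` and `--no-cache-dir` pip options are added here to make sure the package is actually rebuilt; the thread does not establish why the legacy `setup.py install` path produced a non-working build.

```python
import importlib.util
import sys


def wheel_is_installed():
    """Report whether the 'wheel' package is importable in this environment."""
    return importlib.util.find_spec("wheel") is not None


def reinstall_commands(python=sys.executable):
    """Build the pip invocations: install wheel first, then rebuild pytrec_eval."""
    return [
        [python, "-m", "pip", "install", "wheel"],
        [python, "-m", "pip", "install", "--force-reinstall",
         "--no-cache-dir", "pytrec_eval"],
    ]


if __name__ == "__main__":
    if not wheel_is_installed():
        print("wheel is missing; pip may fall back to legacy 'setup.py install'")
    # Print the commands rather than running them, so nothing changes silently.
    for command in reinstall_commands():
        print(" ".join(command))
```

The commands are printed instead of executed so that they can be reviewed and run inside the intended virtual environment.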