joaopalotti / trectools

A simple toolkit to process TREC files in Python.
https://pypi.python.org/pypi/trectools
BSD 3-Clause "New" or "Revised" License

Question about how to fuse multiple ranking results (please help) #43

Closed: BingliangLi closed this issue 1 year ago

BingliangLi commented 1 year ago

Great library, thanks for all the excellent work!

I'm trying to fuse 5 CSV files generated by 5 ranking models. After reading the documentation (example 6), I still don't know how to do it. Could you please help me?

For example, file model_1.csv:

search_id,item_id
1,45
1,3
1,4
1,90
2,5
2,54
2,76

and file model_2.csv:

search_id,item_id
1,45
1,4
1,3
1,78
2,5
2,93
2,54

Note: different models may return different sets of item_id values for the same search (e.g., item_id 90 appears in model_1 for search_id 1, but not in model_2 for search_id 1). Every model also has a validation score (NDCG). Does the validation score help? How should I use it (as a weight, maybe)?

Could you provide some example code showing how to do this (how to read the CSV files as a TrecRun and fuse them, using the validation score as a weight if possible)? And what kind of fusion is appropriate for this kind of task? (The task is fusing the rankings of hotel search results.)
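For reference, here is a minimal sketch of one possible approach: weighted reciprocal rank fusion (RRF) written with the Python standard library only. It does not use the trectools or ranx APIs. It assumes that row order within each search_id is the rank order, that each model's validation NDCG is used directly as its weight, and that the usual RRF constant k=60 is acceptable. The NDCG values, the helper names (`read_ranking`, `weighted_rrf`), and the inline CSV strings standing in for the files are all hypothetical. RRF is only one option; because it works on ranks rather than scores, it copes with models that return different item sets for the same search.

```python
import csv
import io
from collections import defaultdict


def read_ranking(csv_file):
    """Return {search_id: [item_id, ...]}, treating row order as rank order."""
    ranking = defaultdict(list)
    for row in csv.DictReader(csv_file):
        ranking[row["search_id"]].append(row["item_id"])
    return ranking


def weighted_rrf(rankings, weights, k=60):
    """Fuse rankings with weighted reciprocal rank fusion.

    Each item receives weight / (k + rank) from every model that returned it;
    items missing from a model simply get no contribution from that model.
    """
    fused = defaultdict(lambda: defaultdict(float))
    for ranking, weight in zip(rankings, weights):
        for search_id, items in ranking.items():
            for rank, item_id in enumerate(items, start=1):
                fused[search_id][item_id] += weight / (k + rank)
    # Sort by descending fused score; break ties by item_id for determinism.
    return {
        search_id: sorted(scores, key=lambda item: (-scores[item], item))
        for search_id, scores in fused.items()
    }


# Inline copies of the two example files from this issue.
MODEL_1 = "search_id,item_id\n1,45\n1,3\n1,4\n1,90\n2,5\n2,54\n2,76\n"
MODEL_2 = "search_id,item_id\n1,45\n1,4\n1,3\n1,78\n2,5\n2,93\n2,54\n"

runs = [read_ranking(io.StringIO(MODEL_1)), read_ranking(io.StringIO(MODEL_2))]
ndcg_weights = [0.40, 0.35]  # hypothetical validation NDCG, one per model

fused = weighted_rrf(runs, ndcg_weights)
for search_id, items in fused.items():
    print(search_id, items)
```

With real files, `read_ranking(open("model_1.csv", newline=""))` would replace the `io.StringIO` calls. Whether NDCG-based weights actually help is an empirical question, so it is worth comparing against equal weights on held-out data.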

BingliangLi commented 1 year ago

I achieved this using ranx, so the issue is now closed.