AmenRa / ranx

⚡️A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion 🐍

https://amenra.github.io/ranx

MIT License


add an option to disable sort_dict_of_dict_by_value when adding results to a run #9

Closed PaulLerner closed 2 years ago

PaulLerner commented 2 years ago

Hi -- guy with the weird feature requests here :sweat_smile: --

Motivation

You probably don’t want to ask why, but I have a use case where all the documents returned by my system have the same score, yet their order matters. When you add_and_sort documents to a run, you end up applying sort_dict_of_dict_by_value, which may reverse or completely shuffle the order of the document IDs:

In [1]: from ranx import Qrels, Run, evaluate

In [2]: run = Run()
   ...: run.add_multi(
   ...:     q_ids=["q_1", "q_2"],
   ...:     doc_ids=[
   ...:         ["doc_12", "doc_23", "doc_25", "doc_36", "doc_32", "doc_35"],
   ...:         ["doc_12", "doc_11", "doc_25", "doc_36", "doc_2",  "doc_35"],
   ...:     ],
   ...:     scores=[
   ...:         [0.9, 0.9, 0.9, 0.9, 0.9, 0.9],
   ...:         [0.9, 0.9, 0.9, 0.9, 0.9, 0.9],
   ...:     ],
   ...: )
In [3]: list(run.run['q_1'].keys())
Out[3]: ['doc_35', 'doc_32', 'doc_36', 'doc_25', 'doc_23', 'doc_12']

Solution

Obviously, my system could add a slightly negative number to each score to preserve the order of the documents; however, that is more of a pain for me than commenting out this line.

The request

Would you be willing to add an option to disable sort_dict_of_dict_by_value when calling add_multi?

Thanks for the quick response on my other issues :)

AmenRa commented 2 years ago

Hi Paul,

The rationale behind forcing sorting is to prevent users from forgetting about it, which could lead to a wrong evaluation.

I thought about adding an option to add_multi to skip sorting and avoid useless computation. You could add queries to your Run / Qrels in batches, causing ranx to perform sorting even when it's not needed. Because of that, I suggest using .from_dict to create Run / Qrels for now.

However, your problem raises a question about how to evaluate your lists, since they are not ranked by score. If you are sure everything is fine with your data/model, you should handle the issue for your specific case. Otherwise, you could run into reproducibility issues, in my opinion.

Sorry if what I'm about to say seems obvious. If you have a sorted list of document IDs without meaningful scores, you could generate scores as simply as follows:

scores=[s for s in range(len(doc_ids))][::-1]
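
For several queries at once, the same idea can be sketched as below. This is only an illustration in plain Python: rank_based_scores and build_run_dict are hypothetical helper names, not part of ranx, and the resulting nested dict is assumed to have the {q_id: {doc_id: score}} shape that .from_dict accepts.

```python
def rank_based_scores(doc_ids):
    """Return strictly decreasing scores: the first document gets the highest one."""
    n = len(doc_ids)
    return [float(n - 1 - i) for i in range(n)]


def build_run_dict(q_ids, ranked_doc_ids):
    """Build a {q_id: {doc_id: score}} dict from already-ordered document lists."""
    return {
        q_id: dict(zip(docs, rank_based_scores(docs)))
        for q_id, docs in zip(q_ids, ranked_doc_ids)
    }


q_ids = ["q_1", "q_2"]
ranked_doc_ids = [
    ["doc_12", "doc_23", "doc_25", "doc_36", "doc_32", "doc_35"],
    ["doc_12", "doc_11", "doc_25", "doc_36", "doc_2", "doc_35"],
]
run_dict = build_run_dict(q_ids, ranked_doc_ids)

# Sorting by descending score now reproduces the intended order,
# because no two documents of the same query share a score.
recovered = sorted(run_dict["q_1"], key=run_dict["q_1"].get, reverse=True)

# Assumption: the dict could then be handed to ranx, e.g.
# run = Run.from_dict(run_dict)
```

Since the scores only encode the rank position, they carry no meaning beyond the ordering itself.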

It seems pretty feasible to me. What do you think?

Best,

Elias

PaulLerner commented 2 years ago

Hi,

Thanks for your answer. I won’t get into the details but my use case is actually a little bit more tricky than this.

I’ll consider using from_dict!

AmenRa commented 2 years ago

Mind that from_dict still triggers sorting.

PaulLerner commented 2 years ago

Oh, ok, I misunderstood your first answer. So are you still considering "adding an option to avoid sorting to add_multi to avoid useless computation"?

AmenRa commented 2 years ago

I am, but I will probably make changes that do not solve your issue. My idea is to postpone the sorting operation to the first time a Qrels or Run is used for evaluation, following the lazy evaluation paradigm, but not to make sorting completely optional. As I told you before, I want ranx to take care of everything so that the user doesn't have to worry about sorting and other operations.
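
The lazy approach described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the actual ranx implementation: LazyRun, its methods, and the sort_count counter are all hypothetical names made up for illustration. Adding results only marks the container as unsorted, and the sort runs once, the first time a ranking is read.

```python
class LazyRun:
    """Hypothetical container that defers sorting until results are first read."""

    def __init__(self):
        self._data = {}       # {q_id: {doc_id: score}}
        self._sorted = True   # an empty container is trivially sorted
        self.sort_count = 0   # only here to show how often sorting happens

    def add(self, q_id, doc_ids, scores):
        """Store results without sorting; just flag that a sort is pending."""
        self._data[q_id] = dict(zip(doc_ids, scores))
        self._sorted = False

    def _ensure_sorted(self):
        """Sort every query's results by descending score, at most once per change."""
        if self._sorted:
            return
        self._data = {
            q_id: dict(sorted(docs.items(), key=lambda kv: kv[1], reverse=True))
            for q_id, docs in self._data.items()
        }
        self._sorted = True
        self.sort_count += 1

    def ranking(self, q_id):
        """Return the document IDs of a query in ranked order."""
        self._ensure_sorted()
        return list(self._data[q_id])


run = LazyRun()
run.add("q_1", ["doc_a", "doc_b", "doc_c"], [0.1, 0.9, 0.5])
run.add("q_2", ["doc_x", "doc_y"], [0.2, 0.7])
sorts_before_reading = run.sort_count  # no sorting has happened yet

q1_ranking = run.ranking("q_1")
q2_ranking = run.ranking("q_2")
sorts_after_reading = run.sort_count   # both batches were sorted in one pass
```

With this design, batched additions no longer pay for a sort each time, but sorting still always happens before the data is used, so it does not make sorting optional.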

PaulLerner commented 2 years ago

Ok, I understand, thanks for your quick answers :)