PaulLerner closed this issue 2 years ago
Hi Paul,
The rationale behind forcing sorting is to prevent users from forgetting it, which could lead to an incorrect evaluation.
I thought about adding an option to add_multi that skips sorting, to avoid useless computation.
You could add queries to your `Run` / `Qrels` by batch, causing `ranx` to perform sorting even when it's not needed.
Because of that, I suggest using `.from_dict` to create `Run` / `Qrels` for now.
However, your problem raises the question of how to evaluate your lists, since they are not ranked by score. If you are sure everything is fine with your data/model, you should handle the issue for your specific case. Otherwise, you could run into reproducibility issues, in my opinion.
Sorry if what I'm about to say seems obvious. If you have a sorted list of document IDs without meaningful scores, you could generate the scores as simply as follows:
scores=[s for s in range(len(doc_ids))][::-1]
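A slightly fuller sketch of the same idea, assuming `doc_ids` is already in the desired rank order. The helper name `scores_from_ranking` and the sample IDs are hypothetical, and the resulting dict is only meant to resemble a per-query `{doc_id: score}` mapping:

```python
def scores_from_ranking(doc_ids):
    """Assign strictly decreasing scores so sorting by score keeps the given order."""
    n = len(doc_ids)
    # First document gets the highest score (n - 1), last one gets 0.
    return {doc_id: n - 1 - rank for rank, doc_id in enumerate(doc_ids)}


doc_ids = ["d7", "d2", "d9"]  # hypothetical IDs, already in rank order
query_scores = scores_from_ranking(doc_ids)

# Sorting by descending score reproduces the original order.
resorted = sorted(query_scores, key=query_scores.get, reverse=True)
```

Because every score is distinct, any descending sort applied later cannot reorder the documents.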
It seems pretty feasible to me. What do you think?
Best,
Elias
Hi,
Thanks for your answer. I won’t get into the details but my use case is actually a little bit more tricky than this.
I’ll consider using `from_dict`!
Mind that `from_dict` still triggers sorting.
Oh, ok, I misunderstood your first answer. So are you still considering "adding an option to avoid sorting to add_multi to avoid useless computation"?
I am, but I will probably make changes that do not solve your issue.
My idea is to postpone the sorting operation to the first time a `Qrels` or `Run` is used for evaluation, following the lazy evaluation paradigm, but not to make sorting completely optional.
As I told you before, I want `ranx` to take care of everything so that the user doesn't have to worry about sorting and other operations.
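To illustrate the lazy-sorting idea mentioned above, here is a minimal sketch. It is not ranx's implementation: the class `LazySortedRun`, its `ranked` method, and the `sort_count` counter are hypothetical names used only to show sorting being deferred until results are first read.

```python
class LazySortedRun:
    """Hypothetical container (not ranx's Run): sorts only when results are read."""

    def __init__(self):
        self._results = {}      # query_id -> {doc_id: score}
        self._is_sorted = True  # an empty run is trivially sorted
        self.sort_count = 0     # illustration only: number of actual sorts

    def add_multi(self, q_ids, doc_ids, scores):
        """Add a batch of queries without sorting; just mark the run as stale."""
        for q_id, ids, vals in zip(q_ids, doc_ids, scores):
            self._results[q_id] = dict(zip(ids, vals))
        self._is_sorted = False

    def ranked(self):
        """Return results by descending score, sorting only if the run is stale."""
        if not self._is_sorted:
            self._results = {
                q_id: dict(sorted(docs.items(), key=lambda kv: kv[1], reverse=True))
                for q_id, docs in self._results.items()
            }
            self._is_sorted = True
            self.sort_count += 1
        return self._results


run = LazySortedRun()
run.add_multi(["q1"], [["a", "b"]], [[0.1, 0.9]])
run.add_multi(["q2"], [["c", "d"]], [[0.5, 0.2]])
first = run.ranked()   # sorting happens here, once for both batches
second = run.ranked()  # already sorted, no extra work
```

Adding queries in several batches then costs a single sort at evaluation time, while the user still never has to sort anything manually.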
Ok, I understand, thanks for your quick answers :)
Hi -- guy with the weird feature requests here :sweat_smile: --
Motivation
You probably don’t want to ask, but I have a use case where all the documents returned by my system have the same score, yet the order matters! And when you `add_and_sort` documents to a run, you end up applying `sort_dict_of_dict_by_value`, which might reverse or completely shuffle the order of the document IDs:
Solution
Obviously, my system could add a slightly negative offset to each score to preserve the order of documents; however, this is more of a pain to me than commenting out this line.
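For reference, the offset workaround mentioned above could look roughly like this. The helper `break_ties`, the `eps` value, and the sample IDs are assumptions made for illustration; `eps` would need to be smaller than any meaningful score difference yet large enough to survive floating-point rounding.

```python
def break_ties(doc_ids, score, eps=1e-6):
    """Subtract a tiny, growing offset so equal scores become strictly decreasing."""
    return {doc_id: score - i * eps for i, doc_id in enumerate(doc_ids)}


doc_ids = ["d3", "d1", "d2"]  # hypothetical IDs, all tied at the same score
tied = break_ties(doc_ids, score=1.0)

# A descending sort by score now recovers the intended order.
recovered = sorted(tied, key=tied.get, reverse=True)
```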
The request
Would you be willing to add an option to disable `sort_dict_of_dict_by_value` when calling `add_multi`?
Thanks for the quick response on my other issues :)