celsofranssa closed this issue 1 year ago
I think it's better if users do that on their own, so that they are aware of what's happening. Also, if you do not do that upfront, you will end up slowing down the optimisation process, as you will also search for queries for which you have no qrels.
However, how can I integrate this filter during autotune?
Even during autotune, since the retrieval results depend on the queries and the training collection, it is unlikely that qrels contains all possible keys from run (although, ideally, it will intersect with the run).
You don't need the true relevance value for each query-doc tuple. As the error says, the query ids do not match. This means the provided qrels has more, fewer, or different query ids than the queries for which the run was computed.
I see. Since it is costly to obtain relevance feedback, I only have it for a general case. So when I try to autotune the retriever for different slices of data, there is no way to guarantee that the keys in qrels and run are identical, only that they intersect.
Anyway, thank you.
Well, can't you extract the intersection before tuning?
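As a minimal sketch of that suggestion, assuming the usual dict-of-dicts format (`{query_id: {doc_id: score}}`) for both qrels and run, the run can be restricted to the query ids that actually have judgments before tuning. The variable names and sample data below are illustrative, not taken from the thread:

```python
# Hypothetical qrels: relevance judgments, {query_id: {doc_id: relevance}}
qrels = {
    "q1": {"d1": 1, "d2": 0},
    "q2": {"d3": 1},
}

# Hypothetical run: retrieval scores, {query_id: {doc_id: score}}
run = {
    "q1": {"d1": 0.9, "d5": 0.4},
    "q2": {"d3": 0.8},
    "q3": {"d7": 0.7},  # no relevance judgments for q3 -> dropped
}

# Keep only the queries that appear in both qrels and run.
filtered_run = {qid: docs for qid, docs in run.items() if qid in qrels}

print(sorted(filtered_run))  # -> ['q1', 'q2']
```

The filtered dictionary can then be passed to the tuning step, so the evaluation only ever sees queries with judgments.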
Closing for inactivity.
I guess this issue happens because, in `qrels`, there are not scores for all possible results in the `run`. Wouldn't it be interesting to filter the `run` dictionary to only the evaluated cases that occur in `qrels`?