The -c option in trec_eval does the following:

--complete_rel_info_wanted:
-c: Average over the complete set of queries in the relevance judgements
instead of the queries in the intersection of relevance judgements
and results. Missing queries will contribute a value of 0 to all
evaluation measures (which may or may not be reasonable for a
particular evaluation measure, but is reasonable for standard TREC
measures.) Default is off.
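
To make the difference between the two averaging modes concrete, here is a small sketch. The query ids and per-query scores are made up for illustration, and the two helper functions are hypothetical; they only mimic the averaging described in the help text and are not trec_eval's own code.

```python
# Hypothetical per-query scores (e.g. average precision) for the queries
# that the run actually returned results for.
per_query_scores = {"q1": 0.8, "q2": 0.6}

# Hypothetical list of queries that have relevance judgements.
# "q3" is judged, but the run returned nothing for it.
judged_queries = ["q1", "q2", "q3"]


def mean_over_intersection(scores, judged):
    """Average only over judged queries that also appear in the run (-c off)."""
    common = [q for q in judged if q in scores]
    if not common:
        return 0.0
    return sum(scores[q] for q in common) / len(common)


def mean_over_all_judged(scores, judged):
    """Average over every judged query, counting missing ones as 0 (-c on)."""
    if not judged:
        return 0.0
    return sum(scores.get(q, 0.0) for q in judged) / len(judged)


intersection_mean = mean_over_intersection(per_query_scores, judged_queries)
complete_mean = mean_over_all_judged(per_query_scores, judged_queries)
print(intersection_mean, complete_mean)
```

With these made-up numbers the intersection average is about 0.70, while the average over all judged queries is about 0.47, so a single silently missing query changes the reported score noticeably.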
Although the default in trec_eval is off, I think it would be prudent to default this value to on (and maybe give the user an option to turn it off). Without this, a user may accidentally average over an incomplete set of queries, e.g., if their engine doesn't return any results for a given query.
It doesn't look like this is as simple as setting:
self->epi_.average_complete_flag = 1;
because the setting only affects trec_eval's averages, not the individual per-query scores. A fix could be modifying the run dict before sending it to the relevance assessor, adding in any missing queries pointing to empty dicts.
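
A minimal sketch of that workaround, assuming the relevance judgements and the run are both nested dicts keyed by query id (query id -> document id -> value). The helper name pad_run is hypothetical, and whether the evaluator then reports 0 for every measure on an empty result dict is an assumption that would need checking.

```python
def pad_run(qrel, run):
    """Return a copy of `run` with an empty result dict for every judged
    query that the run does not contain.

    Assumes both arguments are dicts keyed by query id. The input `run`
    is left unmodified.
    """
    padded = dict(run)  # shallow copy, so the caller's dict is not mutated
    for query_id in qrel:
        # Only add queries that are missing; existing results are kept as-is.
        padded.setdefault(query_id, {})
    return padded


# Made-up example data: "q2" is judged but has no results in the run.
qrel = {"q1": {"d1": 1}, "q2": {"d2": 1}}
run = {"q1": {"d1": 1.5}}

padded = pad_run(qrel, run)
print(sorted(padded))
```

The padded dict would then be passed to the evaluator in place of the original run, so that the per-query output contains an entry for every judged query.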