usnistgov / trec_eval

Evaluation software used in the Text Retrieval Conference

Negative relevance scores and -J #29

Closed: seanmacavaney closed this issue 8 months ago

seanmacavaney commented 3 years ago

I stumbled upon what appears to be a bug, in which documents with negative relevance scores are removed when using -J.

It's simple to reproduce:

qrels

Q1 0 D1 -1
Q1 0 D2 1

run

Q1 0 D1 -1 2 1 run
Q1 0 D2 1 1 2 run

$ trec_eval qrels run.1 -m P.1
P_1                     all 0.0000
$ trec_eval qrels run.1 -m P.1 -J
P_1                     all 1.0000
# I would expect the above to be 0.0000

I would expect documents with negative relevance scores to count as judged. For instance, in the TREC Web Track, -2 indicates that the assessor regarded the page as "Junk".

This seems to be caused by the docno_info[i].rel >= 0 condition here: https://github.com/usnistgov/trec_eval/blob/master/form_res_rels.c#L219. Since the condition is explicit, is this actually the intended behavior?
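The effect of that condition can be sketched outside of trec_eval. The Python below is a simplified model, not the actual C code. The function name `precision_at_k` and the `judged_only` flag are made up for illustration, and D1 is assumed to be ranked above D2, as the reported outputs imply.

```python
from typing import Dict, List


def precision_at_k(qrels: Dict[str, int], ranking: List[str], k: int,
                   judged_only: bool = False) -> float:
    """P@k for a single topic.

    qrels maps docno -> relevance; ranking lists docnos, best first.
    judged_only is a hypothetical stand-in for the -J flag.
    """
    if judged_only:
        # Models the quoted `rel >= 0` check: negatively judged documents
        # are dropped together with documents that have no judgment at all.
        ranking = [d for d in ranking if d in qrels and qrels[d] >= 0]
    top = ranking[:k]
    # A document counts as relevant only if its relevance is positive.
    relevant = sum(1 for d in top if qrels.get(d, 0) > 0)
    return relevant / k


qrels = {"D1": -1, "D2": 1}
ranking = ["D1", "D2"]  # assumed order: D1 first

print(precision_at_k(qrels, ranking, 1))                    # 0.0
print(precision_at_k(qrels, ranking, 1, judged_only=True))  # 1.0
```

With the filter on, D1 is removed before the cutoff is applied, so D2 moves up to rank 1 and P@1 becomes 1.0, matching the output above.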

isoboroff commented 8 months ago

Historically, a qrel of -1 indicated pooled but not judged. I started using -2 to mark spam, not realizing this quirk of -J. Since -J is documented as "here be dragons", I have mixed feelings about whether to fix this or just document it.
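For illustration only, the two conventions described above can be told apart with a predicate that treats -1 alone as "pooled but not judged" and every other value, including -2, as a real judgment. This sketch is not a fix proposed in this thread and not trec_eval code. `UNJUDGED_MARKER`, `is_judged`, and `filter_judged` are hypothetical names.

```python
from typing import Dict, List, Optional

# Hypothetical constant: the historical "pooled but not judged" value.
UNJUDGED_MARKER = -1


def is_judged(rel: Optional[int]) -> bool:
    """True if rel is an actual assessor judgment under this assumed convention."""
    return rel is not None and rel != UNJUDGED_MARKER


def filter_judged(qrels: Dict[str, int], ranking: List[str]) -> List[str]:
    """Keep only ranked documents that carry an actual judgment."""
    return [d for d in ranking if is_judged(qrels.get(d))]


qrels = {"D1": -2, "D2": 1, "D3": -1}
ranking = ["D1", "D3", "D2", "D4"]

# D1 (spam, -2) stays; D3 (-1) and D4 (absent from the qrels) are dropped.
print(filter_judged(qrels, ranking))  # ['D1', 'D2']
```

Whether -1 should keep its historical meaning is exactly the open question, so the choice of marker value here is an assumption rather than a recommendation.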

seanmacavaney commented 8 months ago

Sounds fair to me!