usnistgov / trec_eval

Evaluation software used in the Text Retrieval Conference

Use of float types #34

Closed isoboroff closed 9 months ago

isoboroff commented 2 years ago

This was reported by Fernando Diaz:

you may want to increase the precision of sim in trec_eval to double.  

typedef struct {                    /* For each retrieved document result */
    char *docno;                       /* document id */
    float sim;                         /* score */
} TEXT_RESULTS;

I get different results when I switch sim to double and evaluate ims_wcs_ap_uf on core 2017. 
casting the double from atof to float in the struct causes issues on macOS.  
let me know if you see the difference too.  

This may have broader implications across the codebase.
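To make the report concrete, here is a minimal sketch of the narrowing step it describes: a score string is parsed with atof (which returns a double) and then stored in a float. The helper names parse_score_as_float, parse_score_as_double and scores_collide_as_float are hypothetical and do not come from the trec_eval source.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a score string with atof() and narrow the result to float,
   as happens when the value is stored in a `float sim` field. */
float parse_score_as_float(const char *text)
{
    assert(text != NULL);
    return (float)atof(text);
}

/* Parse the same string but keep the full double returned by atof(). */
double parse_score_as_double(const char *text)
{
    assert(text != NULL);
    return atof(text);
}

/* Return 1 if the two score strings differ as doubles but compare
   equal once narrowed to float, 0 otherwise. */
int scores_collide_as_float(const char *a, const char *b)
{
    return parse_score_as_double(a) != parse_score_as_double(b)
        && parse_score_as_float(a) == parse_score_as_float(b);
}
```

For scores in [0.5, 1), adjacent float values are 2^-24 (about 6e-8) apart, so two scores closer together than that can compare equal after narrowing even though they differ as doubles.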

AmenRa commented 1 year ago

Hi,

My name is Elias Bassani, and I am the author of ranx (a Python library for ranking evaluation and comparison).

During the implementation of my library, whose metrics are checked against trec_eval for correctness, I noticed trec_eval sometimes misbehaves when document score differences are tiny (less than 10^-8). I think this could be related to the issue above.

Here is a working example:

qrels

q1 0 d0 1

run

q1 Q0 d62 1  0.9752302810058760 Sys
q1 Q0 d25 2  0.9720277433347962 Sys
q1 Q0 d17 3  0.9425774560923942 Sys
q1 Q0 d24 4  0.9406931541311498 Sys
q1 Q0 d84 5  0.9394391812940595 Sys
q1 Q0 d3  6  0.9359824150337328 Sys
q1 Q0 d51 7  0.9316103082678681 Sys
q1 Q0 d0  8  0.9222621453752586 Sys
q1 Q0 d65 9  0.9222621427281577 Sys
q1 Q0 d8  10 0.9062275552229255 Sys

Output of trec_eval -m recip_rank qrels run: 0.1111

Expected result: 0.125 (d0, the only relevant document, is at rank 8)

Simply replacing all float occurrences in the codebase with double solves the issue. After the replacement, make quicktest outputs "Test succeeeded".
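The effect can be reproduced outside trec_eval with a small sketch. It is not the trec_eval code: the Result struct, cmp_result and recip_rank_example are hypothetical names, and the tie-break (equal scores ordered by docno, descending) is an assumption chosen because it reproduces the 0.1111 above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for TEXT_RESULTS, with the score kept as double. */
typedef struct {
    const char *docno;
    double sim;
} Result;

/* Sort by score, descending; break ties by docno, descending.
   The tie-break rule is an assumption, not taken from the source. */
static int cmp_result(const void *a, const void *b)
{
    const Result *x = a;
    const Result *y = b;
    if (x->sim > y->sim) return -1;
    if (x->sim < y->sim) return 1;
    return strcmp(y->docno, x->docno);
}

/* Reciprocal rank of `relevant` in the example run. If as_float is
   nonzero, every score is first narrowed to float precision, which
   mimics storing it in a `float sim` field. */
double recip_rank_example(const char *relevant, int as_float)
{
    Result run[] = {
        {"d62", 0.9752302810058760}, {"d25", 0.9720277433347962},
        {"d17", 0.9425774560923942}, {"d24", 0.9406931541311498},
        {"d84", 0.9394391812940595}, {"d3",  0.9359824150337328},
        {"d51", 0.9316103082678681}, {"d0",  0.9222621453752586},
        {"d65", 0.9222621427281577}, {"d8",  0.9062275552229255},
    };
    size_t n = sizeof run / sizeof run[0];
    size_t i;

    assert(relevant != NULL);

    if (as_float) {
        for (i = 0; i < n; i++)
            run[i].sim = (double)(float)run[i].sim;
    }

    qsort(run, n, sizeof run[0], cmp_result);

    for (i = 0; i < n; i++) {
        if (strcmp(run[i].docno, relevant) == 0)
            return 1.0 / (double)(i + 1);
    }
    return 0.0; /* relevant document not retrieved */
}
```

With as_float set, the scores of d0 and d65 collapse to the same float, the tie-break puts d65 first, and d0 drops to rank 9, giving 1/9. With double scores d0 stays at rank 8, giving 1/8.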

I can do a PR if you want.

Best,

Elias

isoboroff commented 1 year ago

A patch would be welcome; a patch against the 10.0-dev branch even more so. I've been wary of touching this one. Does the 'make test' still pass?

AmenRa commented 1 year ago

Same results on 10.0-dev branch (make quicktest outputs Test succeeeded).

I am currently on macOS. I will check that everything works on Windows and Ubuntu in the next few days.

Anything else you want me to check before opening a PR?

AmenRa commented 1 year ago

PS: make test outputs make: Nothing to be done for 'test' on both branches.

isoboroff commented 9 months ago

This is resolved in the 10.0-rc branch.