embeddings-benchmark / arena

Code for the MTEB Arena
https://hf.co/spaces/mteb/arena

Track time people take to vote #20

Open Muennighoff opened 1 month ago

Muennighoff commented 1 month ago

I think we should track how long people take between submission and casting their vote. This should be a pretty good quality indicator, as some people will likely spam or try to break the arena by submitting a lot. We could filter those votes out by thresholding on the time from submission until the vote is cast. cc @orionw as this would involve small changes in the results file on the hub, wdyt?

Muennighoff commented 1 month ago

Actually, I think we can already compute this via the tstamp field? I.e., take the tstamp at which the 0_conv_id is logged to battle with a vote and subtract the tstamp at which it is logged to individual with the data?

KennethEnevoldsen commented 1 month ago

Yeah, at the very least we should be able to detect "spammy" voters. We could also detect unreliable voters in other ways (e.g., their preferences consistently fail to align with those of other voters).
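
One way the agreement idea could look in practice is sketched below. This is only an illustration under strong assumptions: it presumes each vote can be reduced to a (voter_id, item_id, choice) triple, and nothing in this thread confirms that the logs carry a voter identifier. The function name agreement_rates and all three field names are hypothetical.

```python
from collections import Counter, defaultdict


def agreement_rates(votes):
    """Fraction of each voter's votes that match the majority of the other voters.

    votes: iterable of (voter_id, item_id, choice) triples. All three names
    are hypothetical; the real logs may not contain a voter identifier.
    """
    # Group votes by the item (e.g. one query plus model pair) they were cast on.
    by_item = defaultdict(list)
    for voter, item, choice in votes:
        by_item[item].append((voter, choice))

    agree = Counter()
    total = Counter()
    for entries in by_item.values():
        for voter, choice in entries:
            # Majority among everyone else who voted on the same item.
            others = Counter(c for v, c in entries if v != voter)
            if not others:
                continue  # nobody to compare against
            ranked = others.most_common()
            if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
                continue  # tie among the others, so there is no clear majority
            total[voter] += 1
            agree[voter] += int(choice == ranked[0][0])

    # Voters with no comparable votes are left out of the result.
    return {voter: agree[voter] / total[voter] for voter in total}
```

A voter whose rate stays far below the rest over many items might be worth a manual look. A low rate is not proof of bad faith, though, since legitimate minority preferences look the same.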

orionw commented 1 month ago

> Actually I think we can already compute this via the tstamp field? I.e. subtract the tstamp of the 0_conv_id when it's logged to battle with a vote vs when it's logged to individual with the data?

Good point. If we match against the individual entries, we should be able to compute this. We could also extend the data to add the field explicitly, but either way works.
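
A minimal sketch of the matching approach discussed above, assuming both logs have been loaded as lists of dicts. The tstamp and 0_conv_id fields come from this thread. The conv_id key on the individual entries, the function names, and the 3-second threshold are assumptions made for illustration, not the actual schema or an agreed cutoff.

```python
def seconds_until_vote(individual_records, battle_records):
    """Map each conversation id to the seconds between data logging and the vote."""
    # Earliest tstamp at which each conversation shows up in the individual log.
    shown_at = {}
    for rec in individual_records:
        conv_id = rec["conv_id"]  # assumed field name on individual entries
        tstamp = float(rec["tstamp"])
        if conv_id not in shown_at or tstamp < shown_at[conv_id]:
            shown_at[conv_id] = tstamp

    deltas = {}
    for rec in battle_records:
        conv_id = rec["0_conv_id"]
        if conv_id in shown_at:  # skip votes with no matching individual entry
            deltas[conv_id] = float(rec["tstamp"]) - shown_at[conv_id]
    return deltas


def drop_fast_votes(battle_records, deltas, min_seconds=3.0):
    """Keep votes that took at least min_seconds; unmatched votes are kept as-is."""
    kept = []
    for rec in battle_records:
        delta = deltas.get(rec["0_conv_id"])
        if delta is None or delta >= min_seconds:
            kept.append(rec)
    return kept
```

Keeping unmatched votes is a deliberate choice in this sketch: a missing individual entry says nothing about how fast the vote was. Adding an explicit field to the data, as suggested above, would remove the need for the join entirely.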