A couple of comments for readability:
[x] Multiply scores by 100 and keep one decimal, e.g. 78.1 (@orionw, not sure if this also works for FollowIR?). I believe anything more precise does not translate to a meaningful performance difference.
[x] It might be ideal to move the "performance per task" to a separate table in a fold-down menu.
[x] We could also highlight the highest scores using styling elements.
Originally posted by @KennethEnevoldsen in https://github.com/embeddings-benchmark/mteb/issues/1312#issuecomment-2435013987
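
The three readability suggestions above could be sketched roughly as follows. This is only an illustration in plain Python, not the leaderboard's actual code: the helper names (`format_score`, `render_table`, `fold`), the model and task names, and the scores are all made up, and it assumes scores are stored on a 0-1 scale and that the output is rendered as Markdown with an HTML `<details>` element for the fold-down.

```python
def format_score(score: float) -> str:
    """Scale a 0-1 score to 0-100 and keep one decimal place."""
    return f"{score * 100:.1f}"


def render_table(results: dict[str, dict[str, float]]) -> str:
    """Render a Markdown table, bolding the best displayed score per column."""
    columns = list(next(iter(results.values())))
    # Best raw score per column; ties are decided on the displayed value below.
    best = {c: max(row[c] for row in results.values()) for c in columns}
    lines = [
        "| Model | " + " | ".join(columns) + " |",
        "|---" * (len(columns) + 1) + "|",
    ]
    for model, row in results.items():
        cells = []
        for c in columns:
            text = format_score(row[c])
            is_best = text == format_score(best[c])
            cells.append(f"**{text}**" if is_best else text)
        lines.append(f"| {model} | " + " | ".join(cells) + " |")
    return "\n".join(lines)


def fold(summary: str, body: str) -> str:
    """Wrap content in a collapsible <details> block (fold-down menu)."""
    return f"<details>\n<summary>{summary}</summary>\n\n{body}\n\n</details>"


# Hypothetical scores, for illustration only.
per_task = {
    "model-a": {"Task1": 0.7812, "Task2": 0.5},
    "model-b": {"Task1": 0.6, "Task2": 0.9049},
}
table = render_table(per_task)
folded = fold("Performance per task", table)
print(folded)
```

Comparing the formatted strings rather than the raw floats means two models that display the same one-decimal score are both highlighted, which matches the point that finer differences are not meaningful.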