AmenRa / ranx

⚡️A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion 🐍
https://amenra.github.io/ranx
MIT License

Why is ranx so slow in this simple example? #23

Closed celsofranssa closed 2 years ago

celsofranssa commented 2 years ago
from ranx import Qrels, Run, evaluate

qrels_dict = {
    "text_1": {
        "label_1": 1
    },

    "text_2":{
        "label_2": 1,
    }
}

qrels = Qrels(qrels_dict, name="testing")

run_dict = {
    "text_1": {
        "label_1": 1,
        "label_2": 0.9,
        "label_3": 0.8,
        "label_4": 0.7,
        "label_5": 0.6,
        "label_6": 0.5,
        "label_7": 0.4,
        "label_8": 0.3,
        "label_9": 0.2,
        "label_10": 0.1,
    },

    "text_2": {
        "label_1": 0.9,
        "label_2": 1,
        "label_3": 0.8,
        "label_4": 0.7,
        "label_5": 0.6,
        "label_6": 0.5,
        "label_7": 0.4,
        "label_8": 0.3,
        "label_9": 0.2,
        "label_10": 0.1,
    },
}

run = Run(run_dict, name="bm25")

%%time
evaluate(qrels, run, ["mrr@1", "mrr@5", "mrr@10"])

CPU times: user 55.2 s, sys: 467 ms, total: 55.7 s
Wall time: 56.8 s

This behavior can be reproduced in this Google Colab Notebook 1 and also in this Google Colab Notebook 2 (time spent per step).

celsofranssa commented 2 years ago

Wow, it is taking forever in a real example with about 60k queries! Am I missing something?

AmenRa commented 2 years ago

Hi, and thanks for your interest in ranx!

I think you are missing that all Numba-based ranx functions need to be compiled the first time you use them (there is a disclaimer at the top of each of my notebooks about that). Also, Google Colab is very slow at compiling them.

If you run your notebook again (without reloading Colab), you should notice a much lower computation time.

Unfortunately, I suspect you must recompile ranx's functions every time you start a new Colab instance. On your local machine, the compiled functions should be automatically cached by Numba for future use.

If you are stuck with Colab, I suggest compiling the functions with toy examples before using them on real-world data. You should then be absolutely fine with 60k queries, especially with MRR, which is heavily optimized.
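
For example, a warm-up along these lines (the query and document ids are just placeholders) triggers the compilation once, so the real evaluation that follows runs at full speed:

from ranx import Qrels, Run, evaluate

# Toy data whose only purpose is to trigger Numba's JIT compilation.
warmup_qrels = Qrels({"q_1": {"d_1": 1}}, name="warmup")
warmup_run = Run({"q_1": {"d_1": 1.0, "d_2": 0.9}}, name="warmup")

# The first call compiles the metric kernels; later calls on real data reuse them.
evaluate(warmup_qrels, warmup_run, ["mrr@1", "mrr@5", "mrr@10"])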

On my local machine, a four-year-old MacBook Pro, I get these execution times for 1M queries with 100 results each:

%%time
evaluate(qrels, run, ["mrr@1", "mrr@5", "mrr@10"])

CPU times: user 11.6 s, sys: 38.6 ms, total: 11.6 s
Wall time: 1.05 s

%%time
evaluate(qrels, run, ["map", "mrr", "ndcg"])

CPU times: user 25.8 s, sys: 73.8 ms, total: 25.9 s
Wall time: 2.3 s
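
(For instance, synthetic data of roughly that shape could be built along these lines; the scores are random placeholders and the sizes are scaled down here so the snippet stays light.)

import random
from ranx import Qrels, Run, evaluate

n_queries, n_results = 10_000, 100  # raise n_queries towards 1M to approximate the numbers above

# One relevant document per query, plus n_results randomly scored candidates.
qrels_dict = {f"q_{i}": {f"d_{i}_0": 1} for i in range(n_queries)}
run_dict = {
    f"q_{i}": {f"d_{i}_{j}": random.random() for j in range(n_results)}
    for i in range(n_queries)
}

qrels = Qrels(qrels_dict, name="synthetic")
run = Run(run_dict, name="synthetic")

evaluate(qrels, run, ["mrr@1", "mrr@5", "mrr@10"])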

Hope this answers your question.

Please, consider giving ranx a star if you like it!

Best,

Elias

celsofranssa commented 2 years ago

Hi @AmenRa,

That was the case. Thank you for your quick answer. And it will be a pleasure to give ranx a star.

milyenpabo commented 9 months ago

Hi All,

I ran into the same problem. I'm not using Colab or any kind of notebook, but running my eval as a Python script (it will be part of an eval pipeline). For each execution, instantiating the Qrels objects takes a looooong time, even for a very tiny eval set. For a dict of ~10 entries, Qrels object creation takes 10-20 seconds on a beefy machine.

Is there a way to speed this up? E.g., @AmenRa you say:

"On your local machine, the compiled functions should be automatically stored for future usage by numba."

I'm afraid this is not happening. Any hints on how to check/fix this?

AmenRa commented 9 months ago

@milyenpabo Have you already tried with a dummy Qrels? Could you please post a sample of your specific Qrels without any modification?

milyenpabo commented 9 months ago

Thanks @AmenRa for picking this up quickly. I distilled a minimal example:

#!/usr/bin/env python3

from logger import log  # local logging helper, not part of ranx
from ranx import Qrels
import time

qrels_dict = {}
qrels_dict['test-query'] = {
    'word0' : 1,
    'word1' : 1,
    'word2' : 1,
    'word3' : 1,
    'word4' : 1,
    'word5' : 1,
    'word6' : 1,
    'word7' : 1,
    'word8' : 1,
    'word9' : 1
}

t = time.time()
log.info('Crearting a small Qrels object.')
qrels = Qrels(qrels_dict, name='Test')
log.info(f'Qrels object created in {time.time() - t:.2f} seconds.')

This program runs for roughly 17 seconds; the output is:

[INFO] 2023-11-14 20:30:44.025 generated new fontManager
[INFO] 2023-11-14 20:30:44.554 Crearting a small Qrels object.
[INFO] 2023-11-14 20:31:01.500 Qrels object created in 16.95 seconds.

I'm using ranx-0.3.18.

I also ran the above program with DEBUG logs, and from those it does seem that some compilation-related work is eating up the 17 seconds. I've attached the log file (7 MB, 64k lines):

ranx.log

I suspect I'm missing some basic Numba knowledge here. I'd appreciate any hints on how to fix this issue.

milyenpabo commented 9 months ago

OK, I've been reading up on Numba a bit, and I found the root cause.

  1. I found an option to disable JIT compilation:

https://numba.readthedocs.io/en/stable/user/troubleshoot.html#disabling-jit-compilation

I tried it, and my test program gets significantly faster:

[INFO] 2023-11-14 20:50:48.255 generated new fontManager
[INFO] 2023-11-14 20:50:48.785 Crearting a small Qrels object.
[INFO] 2023-11-14 20:50:48.785 Qrels object created in 0.00 seconds.

So, this works as a quick fix, although I guess we then lose the benefits of Numba for larger eval sets? A less pressing question, then: is there a way to compile once and reuse the result for subsequent runs?
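
For reference, here is a minimal sketch of setting it from Python; it has to happen before ranx (and therefore Numba) is imported:

import os

# Disable Numba JIT compilation entirely; set before ranx/Numba is imported.
os.environ["NUMBA_DISABLE_JIT"] = "1"

from ranx import Qrels  # noqa: E402

# With the JIT disabled, Qrels creation is instantaneous, but the metric
# computations fall back to plain Python, so large eval sets run slower.
qrels = Qrels({"test-query": {"word0": 1}}, name="Test")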

  2. I found the cache=True option in the Numba docs, precisely to allow reusing the Numba-compiled code:

https://numba.readthedocs.io/en/stable/user/faq.html#there-is-a-delay-when-jit-compiling-a-complicated-function-how-can-i-improve-it

Checking the ranx source, it seems it uses the cache=True option (most of the time):

https://github.com/search?q=repo%3AAmenRa%2Franx+jit&type=code
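
(For context, the pattern looks roughly like this; illustrative only, not ranx's actual code.)

import numpy as np
from numba import njit

# cache=True writes the compiled machine code to an on-disk cache (by default a
# __pycache__ directory next to the source file) so later runs can skip recompilation.
@njit(cache=True)
def first_hit_rank(scores, labels):
    # 1-based rank of the first relevant result, 0 if none is relevant.
    order = np.argsort(scores)[::-1]
    for i in range(order.shape[0]):
        if labels[order[i]] > 0:
            return i + 1
    return 0

first_hit_rank(np.array([0.2, 0.9, 0.5]), np.array([0, 0, 1]))  # -> 2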

  3. I realized I left out a seemingly minor detail from my previous post: I run the program in a Docker container. So it might be that every time the program terminates and the container is stopped and removed, I lose the Numba cache with it. After a bit of digging, I found a way to specify the cache directory:

https://numba.readthedocs.io/en/stable/reference/envvars.html#numba-envvars-caching

Interestingly, the default should have worked, because I'm volume-mounting the program directory from the host... Then I realized that I mount the directory in read-only mode, so the Numba cache cannot be written at all. Removing the read-only flag from the volume mount fixes the entire issue:

[INFO] 2023-11-14 21:20:37.417 generated new fontManager
[INFO] 2023-11-14 21:20:37.943 Crearting a small Qrels object.
[INFO] 2023-11-14 21:20:38.781 Qrels object created in 0.84 seconds.
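
(If the source directory has to stay read-only, an alternative is to point Numba's cache at a writable path before importing ranx; the path below is just an example.)

import os

# Redirect Numba's on-disk cache to a writable location (e.g. a mounted volume)
# so compiled functions survive container restarts even when the source directory
# is mounted read-only.
os.environ["NUMBA_CACHE_DIR"] = "/cache/numba"

from ranx import Qrels  # noqa: E402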

--

So, long story short: things should have worked out of the box, except they didn't... I'll leave this pitfall here, in case someone can learn from it later.

AmenRa commented 9 months ago

Hi, thanks for the information and debugging effort. I bet it will be of help to other people. If you like ranx, please give it a star. Thank you!