salaniz / pycocoevalcap

Python 3 support for the MS COCO caption evaluation tools

Why is it so slow to compute the METEOR score? #11

Open jianguda opened 3 years ago

jianguda commented 3 years ago

Hi, @salaniz. I compute METEOR and ROUGE scores, but I find that waiting for the METEOR computation result is rather slow. Could you please tell me why? Thanks!

Here is the code for reproduction, if it helps.

from pycocoevalcap.meteor.meteor import Meteor
from pycocoevalcap.rouge.rouge import Rouge

def evaluate_coco(ref_data, hyp_data):
    scorer_meteor = Meteor()
    scorer_rouge = Rouge()
    # Both scorers expect dicts mapping an example id to a list of caption strings.
    ref_data = [[ref_datum] for ref_datum in ref_data]
    hyp_data = [[hyp_datum] for hyp_datum in hyp_data]
    ref = dict(zip(range(len(ref_data)), ref_data))
    hyp = dict(zip(range(len(hyp_data)), hyp_data))

    print("coco meteor score ...")
    coco_meteor_score = scorer_meteor.compute_score(ref, hyp)[0]
    print("coco rouge score ...")
    coco_rouge_score = float(scorer_rouge.compute_score(ref, hyp)[0])
    return coco_meteor_score, coco_rouge_score

def main():
    ref_data = ['there is a cat on the mat']
    hyp_data = ['the cat is on the mat']
    evaluate_coco(ref_data, hyp_data)

if __name__ == '__main__':
    main()
Anurag14 commented 2 years ago

Hi, I ran into the same issue. The METEOR computation in my case takes hours. Did you find the reason? When I kill the process, I get the following stack trace.

---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Input In [10], in <cell line: 22>()
     20             final_scores[method] = score
     21     return final_scores 
---> 22 calc_scores(corpus, references)

Input In [10], in calc_scores(ref, hypo)
     13 final_scores = {}
     14 for scorer, method in scorers:
---> 15     score, scores = scorer.compute_score(ref, hypo)
     16     if type(score) == list:
     17         for m, s in zip(method, score):

File ~/anaconda3/envs/gpt/lib/python3.9/site-packages/pycocoevalcap/meteor/meteor.py:40, in Meteor.compute_score(self, gts, res)
     37     stat = self._stat(res[i][0], gts[i])
     38     eval_line += ' ||| {}'.format(stat)
---> 40 self.meteor_p.stdin.write('{}\n'.format(eval_line).encode())
     41 self.meteor_p.stdin.flush()
     42 for i in range(0,len(imgIds)):

KeyboardInterrupt: 
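For anyone else hitting this: the traceback ends inside self.meteor_p.stdin.write, which suggests the scorer is blocked while writing to its METEOR subprocess (the scorer pipes the evaluation lines to an external Java process). Below is a minimal sanity-check sketch, assuming your setup is expected to provide a working java on PATH; the exact jar invocation used by the package may differ from anything shown here.

import shutil
import subprocess

# Sanity check: the METEOR scorer pipes data to an external Java process,
# so a missing or broken `java` on PATH can make compute_score hang forever.
# (Assumption: your environment is supposed to have Java installed.)
java = shutil.which("java")
if java is None:
    print("No `java` executable found on PATH -- the METEOR scorer cannot start.")
else:
    out = subprocess.run([java, "-version"], capture_output=True, text=True)
    # `java -version` prints its banner to stderr by convention.
    print(out.stderr.strip() or out.stdout.strip())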
salaniz commented 1 year ago

Calculating the meteor score does take some time, but it shouldn't take hours. Can you try to run the example/coco_eval_example.py script and report your runtime?

Of course, it scales with larger datasets, but even on the whole COCO validation set, evaluating all metrics should not take more than a couple of minutes.

EDIT: @jianguda I timed your code; it took around 8.6 seconds on my machine.
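For reference, here is a quick, self-contained way to check the METEOR runtime on a toy example (a sketch assuming pycocoevalcap is installed and its METEOR dependencies work; timings will of course vary by machine):

import time
from pycocoevalcap.meteor.meteor import Meteor

# Tiny example: one reference caption and one hypothesis caption, keyed by id.
refs = {0: ['there is a cat on the mat']}
hyps = {0: ['the cat is on the mat']}

start = time.perf_counter()
score, _ = Meteor().compute_score(refs, hyps)
elapsed = time.perf_counter() - start
print(f"METEOR = {score:.4f}  (computed in {elapsed:.2f}s)")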

hoangthangta commented 1 year ago

A problem with NLTK or one of its sub-components may cause METEOR to get stuck and take hours to process.

Maybe your computer cannot download one of these packages:

[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\xx\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\xx\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package omw-1.4 to
[nltk_data]     C:\Users\xx\AppData\Roaming\nltk_data...
[nltk_data]   Package omw-1.4 is already up-to-date!

or even NLTK's word_tokenize may make the process slow.
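If a blocked download is the cause, one workaround is to pre-fetch the NLTK data once (with network access) so the evaluation run never stalls on a download attempt; a minimal sketch, assuming the default NLTK download location:

import nltk

# Pre-fetch the corpora/tokenizers mentioned above so a later evaluation run
# never blocks on a (possibly firewalled) download attempt.
for package in ("wordnet", "punkt", "omw-1.4"):
    nltk.download(package, quiet=False)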