salaniz / pycocoevalcap

Python 3 support for the MS COCO caption evaluation tools

computing METEOR fails #15

Open dhansmair opened 2 years ago

dhansmair commented 2 years ago

Hi! When running the example script, I get the following error:

(venv2) *******% python coco_eval_example.py
loading annotations into memory...
Done (t=0.22s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.19s)
creating index...
index created!
tokenization...
PTBTokenizer tokenized 2492309 tokens at 1957824,35 tokens per second.
PTBTokenizer tokenized 381324 tokens at 1401808,46 tokens per second.
setting up scorers...
Downloading stanford-corenlp-3.6.0 for SPICE ...
Progress: 384.5M / 384.5M (100.0%)
Extracting stanford-corenlp-3.6.0 ...
Done.
computing Bleu score...
{'testlen': 313500, 'reflen': 368039, 'guess': [313500, 272996, 232492, 191988], 'correct': [153357, 45146, 12441, 3457]}
ratio: 0.851811900369252
Bleu_1: 0.411
Bleu_2: 0.239
Bleu_3: 0.137
Bleu_4: 0.079
computing METEOR score...
Traceback (most recent call last):
  File "coco_eval_example.py", line 22, in <module>
    coco_eval.evaluate()
  File "/home/stud/*****/venv2/lib/python3.8/site-packages/pycocoevalcap/eval.py", line 53, in evaluate
    score, scores = scorer.compute_score(gts, res)
  File "/home/stud/*****/venv2/lib/python3.8/site-packages/pycocoevalcap/meteor/meteor.py", line 43, in compute_score
    scores.append(float(self.meteor_p.stdout.readline().strip()))
ValueError: could not convert string to float: ''

As you can see, the error happens in meteor.py. Could you give some advice on this problem? Best, David
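
An aside that may help with debugging: an empty string from stdout.readline() usually means the METEOR Java subprocess exited or crashed before writing a score, so the real error message is on its stderr. Below is a rough sketch of a temporary patch to the scoring loop in meteor.py; the names and the stderr=subprocess.PIPE setup are assumed from the upstream code and the snippet posted later in this thread.

# Hedged sketch of a temporary debugging patch for the scoring loop in
# pycocoevalcap/meteor/meteor.py, compute_score(). It assumes the METEOR
# process was started with stderr=subprocess.PIPE and that the surrounding
# names (imgIds, scores, self.meteor_p) match the upstream code.
for i in range(len(imgIds)):
    line = self.meteor_p.stdout.readline().strip()
    if not line:
        # No output from the Java side: the subprocess most likely crashed or
        # exited early, so surface its stderr instead of the float() error.
        # (read() assumes the process has exited; it will block otherwise.)
        err = self.meteor_p.stderr.read().decode(errors="replace")
        raise RuntimeError("METEOR subprocess produced no output:\n" + err)
    scores.append(float(line))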

aliciaviernes commented 2 years ago

I have exactly the same issue. Have you found a solution?

dhansmair commented 2 years ago

Unfortunately not, in the end I just removed the METEOR evaluation from the code.
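
For anyone taking the same route: the individual scorers can be called directly and METEOR simply left out, without editing eval.py. A rough sketch, assuming gts and res are in the format eval.py hands to the tokenizer (image id mapped to a list of annotation dicts with a 'caption' field); the module paths are those of the installed package.

# Rough sketch: evaluate captions without METEOR by calling the scorers directly.
# gts/res: dict of image id -> list of {'caption': ...} dicts (assumed, the same
# format that eval.py passes to the tokenizer before scoring).
from pycocoevalcap.tokenizer.ptbtokenizer import PTBTokenizer
from pycocoevalcap.bleu.bleu import Bleu
from pycocoevalcap.rouge.rouge import Rouge
from pycocoevalcap.cider.cider import Cider

def evaluate_without_meteor(gts, res):
    # Tokenize the same way eval.py does before it runs the scorers.
    tokenizer = PTBTokenizer()
    gts = tokenizer.tokenize(gts)
    res = tokenizer.tokenize(res)

    scorers = [
        (Bleu(4), ["Bleu_1", "Bleu_2", "Bleu_3", "Bleu_4"]),
        (Rouge(), "ROUGE_L"),
        (Cider(), "CIDEr"),
        # (Meteor(), "METEOR") is intentionally left out.
    ]
    results = {}
    for scorer, names in scorers:
        score, _ = scorer.compute_score(gts, res)
        if isinstance(names, list):
            results.update(dict(zip(names, score)))
        else:
            results[names] = score
    return results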

Debolena7 commented 1 year ago

Hi, I faced a similar issue. You may change the code in meteor.py as follows:

def compute_score(self, gts, res):
    assert(gts.keys() == res.keys())
    imgIds = gts.keys()
    scores = []

    eval_line = 'EVAL'
    self.lock.acquire()
    for i in imgIds:
        assert(len(res[i]) == 1)
        stat = self._stat(res[i][0], gts[i])
        eval_line += ' ||| {}'.format(stat)

    self.meteor_p.stdin.write('{}\n'.format(eval_line).encode())
    self.meteor_p.stdin.flush()
    for i in range(0,len(imgIds)):
        scores.append(float(self.meteor_p.stdout.readline().strip()))
    score = float(self.meteor_p.stdout.readline().strip())
    self.lock.release()

    return score, scores

def _stat(self, hypothesis_str, reference_list):
    # SCORE ||| reference 1 words ||| reference n words ||| hypothesis words
    hypothesis_str = hypothesis_str.replace('|||','').replace('  ',' ')
    score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str))
    self.meteor_p.stdin.write('{}\n'.format(score_line).encode())
    self.meteor_p.stdin.flush()
    raw = self.meteor_p.stdout.readline().decode().strip()
    # Convert each stat value to a plain integer string before it is sent back
    # to the METEOR jar in the EVAL line (so no decimal separators are involved).
    numbers = [str(int(float(n))) for n in raw.split()]
    return ' '.join(numbers)

It solved the problem for me.

salaniz commented 1 year ago

Sorry for the late reply. I cannot reproduce this issue. Can you please try out the minimal conda environment below and tell me if the problem persists?

It would be easiest if you just run example/coco_eval_example.py in this repository from inside the example directory.

@not-hermione What exactly is the problem you found and fixed? Maybe it would be easiest if you open a pull request?

name: pycocoevalcap
channels:
  - conda-forge
  - defaults
dependencies:
  - openjdk=11.0.15
  - pip=22.3.1
  - python=3.11.0
  - pip:
      - pycocotools==2.0.6
      - pycocoevalcap==1.2

aliciaviernes commented 1 year ago

Hi, yes, the problem persists even with @not-hermione's fixes and your proposed conda environment.

Done (t=0.20s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
tokenization...
PTBTokenizer tokenized 61268 tokens at 298309,87 tokens per second.
PTBTokenizer tokenized 10892 tokens at 71246,66 tokens per second.
setting up scorers...
computing Bleu score...
{'testlen': 9893, 'reflen': 9855, 'guess': [9893, 8893, 7893, 6893], 'correct': [5732, 2510, 1043, 423]}
ratio: 1.003855910705124
Bleu_1: 0.579
Bleu_2: 0.404
Bleu_3: 0.279
Bleu_4: 0.191
computing METEOR score...
Traceback (most recent call last):
  File "/Users/alikianagnostopoulou/Code/pycocoevalcap/example/coco_eval_example.py", line 21, in <module>
    coco_eval.evaluate()
  File "/opt/miniconda3/envs/pycocoevalcap/lib/python3.11/site-packages/pycocoevalcap/eval.py", line 53, in evaluate
    score, scores = scorer.compute_score(gts, res)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pycocoevalcap/lib/python3.11/site-packages/pycocoevalcap/meteor/meteor.py", line 43, in compute_score
    scores.append(float(self.meteor_p.stdout.readline().strip()))
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: could not convert string to float: b''

salaniz commented 1 year ago

I just realized that this problem is probably related to https://github.com/salaniz/pytorch-gve-lrcn/issues/9. Could you please check what your locale is set to and try this fix: https://github.com/salaniz/pytorch-gve-lrcn/issues/9#issuecomment-469235961
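
Since the suggested fix concerns the locale: METEOR runs as a Java subprocess, and the comma decimal separators in the tokenizer logs above (e.g. "1957824,35 tokens per second") suggest a locale that could make the number formats on the two sides of the pipe disagree. A rough sketch of how one might check the locale and force a C locale before the evaluation starts; whether the JVM picks this up depends on the platform.

# Rough sketch: inspect the current locale and force a POSIX/C locale before
# pycocoevalcap launches the METEOR jar, so the Java side formats and parses
# numbers with '.' as the decimal separator. Run this before coco_eval.evaluate().
import locale
import os

print("current locale:", locale.getlocale())  # e.g. ('de_DE', 'UTF-8')

os.environ["LC_ALL"] = "C"   # inherited by the METEOR subprocess via Popen
os.environ["LANG"] = "C"

# Roughly equivalent from the shell:  LC_ALL=C python coco_eval_example.py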

regnujAx commented 1 year ago

I think I fixed this (or a similar) problem by replacing the line self.meteor_p.stdin.write('{}\n'.format(eval_line).encode()) in compute_score() in meteor.py with self.meteor_p.stdin.write('{}\n'.format(eval_line).replace('.', ',').encode()). It seems there was an OS-dependent (or locale-dependent, I'm not sure) problem with the decimal separators in the string.
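
For reference, a sketch of where that change sits, assuming the compute_score structure posted earlier in this thread; the replacement turns the decimal points in the EVAL line into commas so they match what the METEOR jar apparently expects under such a locale.

# Sketch of the workaround in context (structure assumed from the upstream
# meteor.py and the snippet earlier in this thread).
eval_line = 'EVAL'
for i in imgIds:
    stat = self._stat(res[i][0], gts[i])
    eval_line += ' ||| {}'.format(stat)

# Original line:
#   self.meteor_p.stdin.write('{}\n'.format(eval_line).encode())
# Workaround: send commas instead of decimal points.
self.meteor_p.stdin.write('{}\n'.format(eval_line).replace('.', ',').encode())
self.meteor_p.stdin.flush()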