Can you tell me which Python version you are using and what the value of self.meteor_p.stdout.readline().strip() is before the error happens?
Hey, I am using Python 3.5.6. I also printed the value of that expression, and this is what I got:
Epoch [0/50], Step [310/317]
Loading and preparing results...
DONE (t=0.10s)
creating index...
index created!
==================================================cocoeval=========================================
<pycocoevalcap.eval.COCOEvalCap object at 0x7f07a2ee4e48>
<class 'pycocoevalcap.eval.COCOEvalCap'>
tokenization...
PTBTokenizer tokenized 2492307 tokens at 2682485,86 tokens per second.
PTBTokenizer tokenized 426774 tokens at 1468350,47 tokens per second.
setting up scorers...
computing Bleu score...
{'guess': [386271, 345767, 305263, 264759], 'testlen': 386271, 'correct': [248500, 112587, 45275, 18918], 'reflen': 384506}
ratio: 1.0045903054828766
Bleu_1: 0.643
Bleu_2: 0.458
Bleu_3: 0.314
Bleu_4: 0.217
computing METEOR score...
================
b''
============================
Traceback (most recent call last):
File "main.py", line 106, in <module>
score = val_dataset.eval(result, checkpoint_path)
File "/data1/home/mrim/hassanm/gve/pytorch-vision-language/utils/data/coco_dataset.py", line 214, in eval
cocoEval.evaluate()
File "/data1/home/mrim/hassanm/gve/pytorch-vision-language/pycocoevalcap/eval.py", line 50, in evaluate
score, scores = scorer.compute_score(gts, res)
File "/data1/home/mrim/hassanm/gve/pytorch-vision-language/pycocoevalcap/meteor/meteor.py", line 46, in compute_score
scores.append(float(self.meteor_p.stdout.readline().strip()))
ValueError: could not convert string to float:
I don't know why it is getting a b'' value.
Try this; maybe it will fix the bug: https://github.com/tylin/coco-caption/pull/35/files
Currently, I am not calculating the METEOR score. I will try it in about a week and will post here if it works.
I can't reproduce this. Can you check what the value of eval_line is in your case?
It should be something like this:
EVAL ||| 9.0 14.0 3.0 7.0 0.0 0.0 3.0 3.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 4.0 4.0 4.0
You can test whether the METEOR jar works on your system by navigating to the meteor directory (where meteor.py is located) and checking that you can run the following:
java -jar -Xmx2G meteor-1.5.jar - - -stdio -l en -norm
(it should wait for standard input). You might want to backtrack why your string is empty by looking into the values that are fed into the _stat() function and by validating that the content of your score_line returns proper results with the same procedure as described above (evaluating the line directly by calling the METEOR jar with java).
The raw values come from the data and the model itself, so there might be something wrong outside the meteor script.
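For convenience, here is a minimal sketch of the same check driven from Python, mirroring how meteor.py talks to the jar over stdin/stdout. The EVAL line is the known-good example quoted above; run this from the meteor directory so the jar and its data files are found.

```python
import subprocess

# Start the METEOR jar in stdio mode, exactly as in the java command above.
cmd = ['java', '-jar', '-Xmx2G', 'meteor-1.5.jar', '-', '-', '-stdio', '-l', 'en', '-norm']
proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

# Feed one known-good stats line; the jar should reply with a segment score
# followed by the final aggregate score.
eval_line = 'EVAL ||| 9.0 14.0 3.0 7.0 0.0 0.0 3.0 3.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 4.0 4.0 4.0'
proc.stdin.write((eval_line + '\n').encode())
proc.stdin.flush()
print(proc.stdout.readline().strip())  # segment score
print(proc.stdout.readline().strip())  # final score
proc.kill()
```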
Okay. I will have a look.
One more thing: I trained the whole model, but the CIDEr score mentioned in the paper is around 56.7 and I am not getting that. How do you select the best checkpoint? From the checkpoint directory, I can't tell which checkpoint is the best.
I am assuming you mean the GVE model on CUB. Try to tune the loss_lambda parameter a bit to get better results; 0.2 should be a good starting point.
I am creating a symbolic link named best-ckpt.pth pointing to the checkpoint with the highest validation performance, so in the checkpoint directory you should see this link. This might not be available if you are using Windows, though. If that's the case, you could simply change that part of the code to either save it as a file instead of a link or save the name of the best checkpoint in a text file.
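As an illustration of that suggestion, here is a hedged sketch (the function names are hypothetical, not the repo's actual code):

```python
import os

def update_best_checkpoint(ckpt_dir, best_ckpt_name):
    # Point best-ckpt.pth at the checkpoint with the best validation score.
    # os.symlink fails if the link already exists, so remove any old link first.
    link_path = os.path.join(ckpt_dir, 'best-ckpt.pth')
    if os.path.islink(link_path):
        os.remove(link_path)
    os.symlink(best_ckpt_name, link_path)

def update_best_checkpoint_portable(ckpt_dir, best_ckpt_name):
    # Windows-friendly fallback: record the best checkpoint's name in a text file.
    with open(os.path.join(ckpt_dir, 'best-ckpt.txt'), 'w') as f:
        f.write(best_ckpt_name)
```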
Okay, I tried to print the values and this is what I got:
============================stat
10.0 11.0 5.0 7.0 2.0 2.0 4.0 4.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 5.0 7.0 7.0
stat================================
score line============================
SCORE ||| a small toaster oven sitting on top of a wooden table ||| toaster oven set up on table in a communal room ||| a toaster oven is sitting on top of a table in a classroom ||| a toaster oven in a state of mild dis-assembly on a workbench ||| a toaster oven two people sitting at a table and a lady walking with a purse ||| a refrigerator with a refrigerator and a microwave
============================stat
8.0 16.0 5.0 9.0 0.0 0.0 5.0 5.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3.0 5.0 5.0
stat================================
================
b''
============================
Traceback (most recent call last):
File "main.py", line 106, in <module>
score = val_dataset.eval(result, checkpoint_path)
File "/data1/home/mrim/hassanm/gve/pytorch-vision-language/utils/data/coco_dataset.py", line 214, in eval
cocoEval.evaluate()
File "/data1/home/mrim/hassanm/gve/pytorch-vision-language/pycocoevalcap/eval.py", line 50, in evaluate
score, scores = scorer.compute_score(gts, res)
File "/data1/home/mrim/hassanm/gve/pytorch-vision-language/pycocoevalcap/meteor/meteor.py", line 49, in compute_score
scores.append(float(self.meteor_p.stdout.readline().strip()))
ValueError: could not convert string to float:
The line print(self.meteor_p.stdout.readline().strip()) prints b''. After that, it gives the float conversion error on the next line.
Make sure you don't call self.meteor_p.stdout.readline().strip() twice when you add a print statement; store it in a temporary variable instead. Otherwise you will not be evaluating what you are printing.
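Concretely, the change inside compute_score would look something like this sketch (adapted from the line in the traceback; not a verbatim patch):

```python
# Read the line once and reuse it; calling readline() again inside a print
# statement would consume the next line from the jar's output stream.
raw = self.meteor_p.stdout.readline().strip()
print(repr(raw))           # shows e.g. b'' when the jar wrote nothing
scores.append(float(raw))
```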
Can you please try to run java directly as I described above to verify that it works?
Which OS are you on?
Yeah, I called it only one time. I am using Ubuntu. I tried to run java directly and got this:
(pytorch) hassanm@decore1:~/gve/pytorch-vision-language/pycocoevalcap/meteor$ java -jar -Xmx2G meteor-1.5.jar - - -stdio -l en -norm
EVAL ||| 9.0 14.0 3.0 7.0 0.0 0.0 3.0 3.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 4.0 4.0 9
Exception in thread "main" java.util.InputMismatchException
at java.util.Scanner.throwFor(Scanner.java:864)
at java.util.Scanner.next(Scanner.java:1485)
at java.util.Scanner.nextDouble(Scanner.java:2413)
at edu.cmu.meteor.scorer.MeteorStats.<init>(Unknown Source)
at Meteor.scoreStdio(Unknown Source)
at Meteor.main(Unknown Source)
(pytorch) hassanm@decore1:~/gve/pytorch-vision-language/pycocoevalcap/meteor$
It seems like you missed a number at the end of the string. Try again with the full line that I gave you above. Also try the content of eval_line from your own run, because you probably want to verify that your pipeline works.
Still the same error. :(
(pytorch) hassanm@decore1:~/gve/pytorch-vision-language/pycocoevalcap/meteor$ java -jar -Xmx2G meteor-1.5.jar - - -stdio -l en -norm
EVAL ||| 9.0 14.0 3.0 7.0 0.0 0.0 3.0 3.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 4.0 4.0 4.0
Exception in thread "main" java.util.InputMismatchException
at java.util.Scanner.throwFor(Scanner.java:864)
at java.util.Scanner.next(Scanner.java:1485)
at java.util.Scanner.nextDouble(Scanner.java:2413)
at edu.cmu.meteor.scorer.MeteorStats.<init>(Unknown Source)
at Meteor.scoreStdio(Unknown Source)
at Meteor.main(Unknown Source)
(pytorch) hassanm@decore1:~/gve/pytorch-vision-language/pycocoevalcap/meteor$
Ok, then it seems to be a problem with the meteor-1.5.jar. Can you please check out the fixes that other people proposed here: https://github.com/tylin/coco-caption/issues/6
Yeah, I tried that too. Nothing seems to be working.
Also, after running python main.py --model sc --dataset cub, we should find best-ckpt.pth in ./checkpoints/sc-cub-D<date>-T<time>-G<GPUid>, but I can't find it there. It was saved for the final GVE model but not for sc.
Also, I want to do error analysis. How can I run the evaluation on a single image? Please point me in the right direction.
That's strange. Everything seems to point to the locale setting being the problem here.
Can you please just verify the following for me: do you get the same error when you feed the line below into the java command?
EVAL ||| 9,0 14,0 3,0 7,0 0,0 0,0 3,0 3,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 1,0 1,0 0,0 0,0 4,0 4,0 4,0
What do you get when you type echo $LANG?
Regarding the other questions: the sentence classifier uses the same code for saving the checkpoints as the other models, so I can't explain why you would get the best-ckpt.pth for one but not the other.
When you run the evaluation on a dataset, you should get the files results-val-captions.json and results-val-metrics-imgs.json, which give you the evaluation results individually per image.
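For single-image error analysis, one could then load those files and look up individual images. The sketch below assumes both files are JSON keyed by image id; the exact layout in the repo may differ:

```python
import json

# Load the per-image outputs produced by the evaluation run.
with open('results-val-captions.json') as f:
    captions = json.load(f)   # generated captions (structure assumed)
with open('results-val-metrics-imgs.json') as f:
    metrics = json.load(f)    # per-image metric scores (structure assumed)

# Inspect one entry, e.g. the first image id present in the metrics file.
img_id = next(iter(metrics))
print(img_id, metrics[img_id])
```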
I got a score by typing the above input:
(pytorch) hassanm@decore1:~/gve/pytorch-vision-language/pycocoevalcap/meteor$ java -jar -Xmx2G meteor-1.5.jar - - -stdio -l en -norm
EVAL ||| 9,0 14,0 3,0 7,0 0,0 0,0 3,0 3,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 1,0 1,0 0,0 0,0 4,0 4,0 4,0
0.0712430426716141
0.0712430426716141
The echo $LANG output is fr_FR.UTF-8.
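For context: Java's Scanner parses doubles according to the default locale, so under fr_FR it accepts 4,0 but rejects 4.0, which matches the two experiments above. A minimal Python demonstration of the same locale behaviour (assuming the fr_FR.UTF-8 locale is installed on the system):

```python
import locale

# Switch numeric formatting/parsing to the French locale.
locale.setlocale(locale.LC_NUMERIC, 'fr_FR.UTF-8')
print(locale.format_string('%.1f', 9.0))  # '9,0' -- comma decimal separator
print(locale.atof('9,0'))                 # 9.0  -- locale-aware parsing
# The built-in float('9,0') would raise ValueError regardless of locale.
```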
Yeah, I can't get it to work. I will debug and check that. But the sc-cub-... directory doesn't have any best-ckpt.pth file.
Fixed the issue by setting the locale.
$ LANG=en_US.UTF-8
$ locale | grep LANG
Hi, I am able to get a METEOR score of around 28.4, whereas the paper mentions 29.2. But I can't get close to the CIDEr score: I get 47.3, whereas the paper reports 56.7. The difference is too big.
What value do you use for loss_lambda? You will have to do a bit of hyperparameter search to get good values. I only remember that 0.2 is a good starting point.
Since GVE has a REINFORCE loss there is quite a bit of variance involved. A reasonable extension would be to introduce a baseline to reduce the variance.
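To illustrate the baseline idea (this is a generic sketch of variance reduction for REINFORCE, not code from this repo): subtracting a running average of the reward from each sample's reward leaves the gradient estimate unbiased while reducing its variance.

```python
class MovingAverageBaseline:
    """Exponential moving average of the reward, used as a REINFORCE baseline."""
    def __init__(self, momentum=0.9):
        self.momentum = momentum
        self.value = 0.0

    def update(self, reward):
        self.value = self.momentum * self.value + (1.0 - self.momentum) * reward
        return self.value

baseline = MovingAverageBaseline()
# Per sampled sequence:
#   advantage = reward - baseline.update(reward)
#   loss = -advantage * sum_of_log_probs
```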
For loss_lambda, I am using 0.2. The METEOR score is close to the one reported in the paper, but not CIDEr.
Hi Salaniz, just a few questions for my clarification:
1. CUB_feature_dict.p contains the features of the CUB dataset, obtained using the compact fine-grained classification model. Is that right? Can we use the images directly instead of the features?
2. What is cub_tokens_file for? We also have cub_vocab.pkl, cub_vocab_nounk.txt, and vocab.txt.
3. Where do the generated sentences for each image, which are saved in descriptions_bird, come from? How are they generated? Are they the ground-truth sentences, or are they annotated by users?
4. How can we get the classification score or classification label for the CUB dataset using the LRCN model? Also, how can I get the image? I am getting a 1D tensor from the CubDataset class.
5. GVE uses a pretrained LRCN network trained on COCO. From which checkpoint directory does it get the weights if there are so many in the checkpoint folder?
Fixed the issue by setting the locale.
$ LANG=en_US.UTF-8
$ locale | grep LANG
May I know how you did it?
@rmaeperito2: Maybe you can run export LANG=en_US.UTF-8 in the shell where you want to train the model. But it is only a hotfix and works only for that one window/shell; if you close it, you have to call export LANG=en_US.UTF-8 again the next time. At the moment, I don't know of a long-term solution.
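One possible programmatic workaround (a sketch under the assumption that meteor.py starts the jar via subprocess.Popen, as the traceback suggests) is to force a dot-decimal locale for just the METEOR subprocess by passing a modified environment:

```python
import os
import subprocess

# Copy the current environment and override the locale only for the jar,
# so Java's Scanner parses '9.0' instead of expecting '9,0'.
env = os.environ.copy()
env['LC_ALL'] = 'C'  # or 'en_US.UTF-8' if it is installed

meteor_p = subprocess.Popen(
    ['java', '-jar', '-Xmx2G', 'meteor-1.5.jar', '-', '-', '-stdio', '-l', 'en', '-norm'],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, env=env)
```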
It seems this issue is indeed solved by setting the locale properly. Either way, this is rather an issue with pycocoevalcap. If anyone feels this is important to handle automatically, please open an issue there.
Getting a ValueError when calling cocoEval.evaluate in coco_dataset.py on line 211.