salaniz / pytorch-gve-lrcn

PyTorch implementations for "Generating Visual Explanations" (GVE) and "Long-term Recurrent Convolutional Networks" (LRCN)
MIT License

ValueError: Could not convert string to float #9

Closed muneebable closed 1 year ago

muneebable commented 5 years ago

Getting a ValueError when calling cocoEval.evaluate in coco_dataset.py on line 211.

ratio: 1.0076053968122858
Bleu_1: 0.644
Bleu_2: 0.456
Bleu_3: 0.309
Bleu_4: 0.209
computing METEOR score...
Traceback (most recent call last):
  File "main.py", line 106, in <module>
    score = val_dataset.eval(result, checkpoint_path)
  File "/data1/home/mrim/hassanm/gve/pytorch-vision-language/utils/data/coco_dataset.py", line 211, in eval
    cocoEval.evaluate()
  File "/data1/home/mrim/hassanm/gve/pytorch-vision-language/pycocoevalcap/eval.py", line 50, in evaluate
    score, scores = scorer.compute_score(gts, res)
  File "/data1/home/mrim/hassanm/gve/pytorch-vision-language/pycocoevalcap/meteor/meteor.py", line 43, in compute_score
    scores.append(float(self.meteor_p.stdout.readline().strip()))
ValueError: could not convert string to float:
salaniz commented 5 years ago

Can you tell me which Python version you are using and what the value of self.meteor_p.stdout.readline().strip() is before the error happens?

muneebable commented 5 years ago

Hey, I am using Python 3.5.6. I also printed the value of this expression and got:

Epoch [0/50], Step [310/317]
Loading and preparing results...
DONE (t=0.10s)
creating index...
index created!
==================================================cocoeval=========================================
<pycocoevalcap.eval.COCOEvalCap object at 0x7f07a2ee4e48>
<class 'pycocoevalcap.eval.COCOEvalCap'>
tokenization...
PTBTokenizer tokenized 2492307 tokens at 2682485,86 tokens per second.
PTBTokenizer tokenized 426774 tokens at 1468350,47 tokens per second.
setting up scorers...
computing Bleu score...
{'guess': [386271, 345767, 305263, 264759], 'testlen': 386271, 'correct': [248500, 112587, 45275, 18918], 'reflen': 384506}
ratio: 1.0045903054828766
Bleu_1: 0.643
Bleu_2: 0.458
Bleu_3: 0.314
Bleu_4: 0.217
computing METEOR score...
================
b''
============================
Traceback (most recent call last):
  File "main.py", line 106, in <module>
    score = val_dataset.eval(result, checkpoint_path)
  File "/data1/home/mrim/hassanm/gve/pytorch-vision-language/utils/data/coco_dataset.py", line 214, in eval
    cocoEval.evaluate()
  File "/data1/home/mrim/hassanm/gve/pytorch-vision-language/pycocoevalcap/eval.py", line 50, in evaluate
    score, scores = scorer.compute_score(gts, res)
  File "/data1/home/mrim/hassanm/gve/pytorch-vision-language/pycocoevalcap/meteor/meteor.py", line 46, in compute_score
    scores.append(float(self.meteor_p.stdout.readline().strip()))
ValueError: could not convert string to float:

I don't know why it is getting the b'' value.

muneebable commented 5 years ago

Try this; maybe it will fix the bug: https://github.com/tylin/coco-caption/pull/35/files

Currently, I am not calculating the METEOR score. I will try it in about a week and will post here if it works.

salaniz commented 5 years ago

I can't reproduce this. Can you check what the value of eval_line is in your case?

It should be something like this: EVAL ||| 9.0 14.0 3.0 7.0 0.0 0.0 3.0 3.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 4.0 4.0 4.0

You can test if the meteor jar works on your system by navigating to the meteor directory (where meteor.py is located) and check that you can do the following:

  1. Execute java -jar -Xmx2G meteor-1.5.jar - - -stdio -l en -norm (it should wait for standard input)
  2. Copy and paste the above string (or your eval string), hit enter, and wait for a response, which should be floating point numbers

You might want to track down why your string is empty by looking into the values that are fed into the _stat() function, and by validating that your score_line returns proper results with the same procedure as described above (evaluating the line directly by calling the meteor jar with java).
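If you would rather drive that test from Python than paste into the terminal, here is a minimal sketch (not code from this repo; it assumes you run it from the meteor directory so that meteor-1.5.jar is found):

```python
import subprocess

# Spawn METEOR in stdio mode, exactly like the manual java test above.
cmd = ['java', '-jar', '-Xmx2G', 'meteor-1.5.jar',
       '-', '-', '-stdio', '-l', 'en', '-norm']
p = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# Feed one EVAL line (the example stats from above) and read the reply.
eval_line = ('EVAL ||| 9.0 14.0 3.0 7.0 0.0 0.0 3.0 3.0 0.0 0.0 0.0 0.0 '
             '0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 4.0 4.0 4.0\n')
p.stdin.write(eval_line.encode())
p.stdin.flush()

print(p.stdout.readline().strip())  # per-segment score
print(p.stdout.readline().strip())  # aggregate score
p.kill()
print(p.stderr.read().decode())     # any Java exception text shows up here
```

If the two printed lines are empty (b''), that reproduces the failure outside the evaluation pipeline.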

The raw values come from the data and the model itself, so there might be something wrong outside the meteor script.

muneebable commented 5 years ago

Okay, I will have a look.

One more thing: I trained the whole model, but the CIDEr score mentioned in the paper is around 56.7 and I am not getting that. How do you select the best checkpoint? From the checkpoint directory, I can't tell which checkpoint is the best.

salaniz commented 5 years ago

I am assuming you mean the GVE model on CUB. Try tuning the loss_lambda parameter a bit to get better results; 0.2 should be a good starting point.

I am creating a symbolic link named best-ckpt.pth pointing to the checkpoint with the highest validation performance. So in the checkpoint directory you should see this link. This might not be available if you are using Windows though. If that's the case, you could simply change that part of the code to either save it as a file instead of a link or simply save the name of the best checkpoint in a text file.

https://github.com/salaniz/pytorch-gve-lrcn/blob/b980162b280d46e4c09df91d93c0c11aa4ff4761/main.py#L119-L125
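For reference, a hedged sketch of the text-file fallback described above; the names record_best_checkpoint and best-ckpt.txt are illustrative, not what main.py actually uses:

```python
import os

def record_best_checkpoint(checkpoint_dir, ckpt_name):
    """Remember the best checkpoint: symlink on POSIX, text file elsewhere."""
    link = os.path.join(checkpoint_dir, 'best-ckpt.pth')
    if os.name == 'posix':
        if os.path.lexists(link):   # replace a stale link from a previous epoch
            os.remove(link)
        os.symlink(ckpt_name, link)
    else:
        # Windows fallback: store the best checkpoint's file name instead
        with open(os.path.join(checkpoint_dir, 'best-ckpt.txt'), 'w') as f:
            f.write(ckpt_name)
```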

muneebable commented 5 years ago

Okay, I tried to print the values and this is what I got:

============================stat
10.0 11.0 5.0 7.0 2.0 2.0 4.0 4.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 5.0 7.0 7.0
stat================================
score line============================
SCORE ||| a small toaster oven sitting on top of a wooden table ||| toaster oven set up on table in a communal room ||| a toaster oven is sitting on top of a table in a classroom ||| a toaster oven in a state of mild dis-assembly on a workbench ||| a toaster oven two people sitting at a table and a lady walking with a purse ||| a refrigerator with a refrigerator and a microwave
============================stat
8.0 16.0 5.0 9.0 0.0 0.0 5.0 5.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3.0 5.0 5.0
stat================================
================
b''
============================
Traceback (most recent call last):
  File "main.py", line 106, in <module>
    score = val_dataset.eval(result, checkpoint_path)
  File "/data1/home/mrim/hassanm/gve/pytorch-vision-language/utils/data/coco_dataset.py", line 214, in eval
    cocoEval.evaluate()
  File "/data1/home/mrim/hassanm/gve/pytorch-vision-language/pycocoevalcap/eval.py", line 50, in evaluate
    score, scores = scorer.compute_score(gts, res)
  File "/data1/home/mrim/hassanm/gve/pytorch-vision-language/pycocoevalcap/meteor/meteor.py", line 49, in compute_score
    scores.append(float(self.meteor_p.stdout.readline().strip()))
ValueError: could not convert string to float:

print(self.meteor_p.stdout.readline().strip()) — this line prints b'', after which the float conversion on the next line raises the error.

salaniz commented 5 years ago

Make sure you don't call self.meteor_p.stdout.readline().strip() twice when you add a print statement. Store it in a temporary variable instead. Otherwise you will not evaluate what you are printing.
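In other words, inside compute_score in meteor.py the debug print should look roughly like this (a sketch of the relevant lines, not a full method):

```python
# readline() consumes the stream each call, so read once, keep the
# value, and print and parse that same value.
line = self.meteor_p.stdout.readline().strip()
print(repr(line))           # shows b'' when METEOR returned nothing
scores.append(float(line))  # this is where the ValueError surfaces
```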

Can you please try to run java directly, as I described above, to verify that it works?

Which OS are you on?

muneebable commented 5 years ago

Yeah, I called it only once. I am using Ubuntu. I tried to run java directly and got this:

(pytorch) hassanm@decore1:~/gve/pytorch-vision-language/pycocoevalcap/meteor$ java -jar -Xmx2G meteor-1.5.jar - - -stdio -l en -norm
EVAL ||| 9.0 14.0 3.0 7.0 0.0 0.0 3.0 3.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 4.0 4.0
Exception in thread "main" java.util.InputMismatchException
        at java.util.Scanner.throwFor(Scanner.java:864)
        at java.util.Scanner.next(Scanner.java:1485)
        at java.util.Scanner.nextDouble(Scanner.java:2413)
        at edu.cmu.meteor.scorer.MeteorStats.<init>(Unknown Source)
        at Meteor.scoreStdio(Unknown Source)
        at Meteor.main(Unknown Source)
(pytorch) hassanm@decore1:~/gve/pytorch-vision-language/pycocoevalcap/meteor$
salaniz commented 5 years ago

It seems like you missed a number at the end of the string. Try again with the full line that I gave you above. Also try the content of eval_line during your execution, because you probably want to verify that your pipeline works.

muneebable commented 5 years ago

Still the same error. :(

(pytorch) hassanm@decore1:~/gve/pytorch-vision-language/pycocoevalcap/meteor$ java -jar -Xmx2G meteor-1.5.jar - - -stdio -l en -norm
EVAL ||| 9.0 14.0 3.0 7.0 0.0 0.0 3.0 3.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 4.0 4.0 4.0
Exception in thread "main" java.util.InputMismatchException
        at java.util.Scanner.throwFor(Scanner.java:864)
        at java.util.Scanner.next(Scanner.java:1485)
        at java.util.Scanner.nextDouble(Scanner.java:2413)
        at edu.cmu.meteor.scorer.MeteorStats.<init>(Unknown Source)
        at Meteor.scoreStdio(Unknown Source)
        at Meteor.main(Unknown Source)
(pytorch) hassanm@decore1:~/gve/pytorch-vision-language/pycocoevalcap/meteor               
salaniz commented 5 years ago

Ok, then it seems to be a problem with the meteor-1.5.jar. Can you please check out the fixes that other people proposed here: https://github.com/tylin/coco-caption/issues/6

muneebable commented 5 years ago

Yeah, I tried that too; nothing seems to be working so far. Also, after running python main.py --model sc --dataset cub, we should find best-ckpt.pth in ./checkpoints/sc-cub-D<date>-T<time>-G<GPUid>, but I can't find it there. It is saved for the final GVE model but not for sc.

Also, I want to do error analysis. How can I run the evaluation on a single image? Could you point me in the right direction?

salaniz commented 5 years ago

That's strange. Everything seems to point to the locale setting being the problem here. Can you please just verify the following for me: do you get the same error when you feed the line below into the java command? EVAL ||| 9,0 14,0 3,0 7,0 0,0 0,0 3,0 3,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 1,0 1,0 0,0 0,0 4,0 4,0 4,0

What do you get when you type echo $LANG?

Regarding the other questions: the sentence classifier uses the same code for saving the checkpoints as the other models, so I can't explain why you would get the best-ckpt.pth for one but not the other. When you run the evaluation on a dataset, you should get the files results-val-captions.json and results-val-metrics-imgs.json which give you the evaluation results individually per image.
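For the single-image analysis, something along these lines should work as a starting point (only a sketch; the exact JSON layout of the results file is an assumption, so inspect one entry first and adapt the key names):

```python
import json

# Load the per-image metrics written after evaluation.
with open('results-val-metrics-imgs.json') as f:
    per_image = json.load(f)

# Print one entry to learn the structure before filtering by image or score.
sample = next(iter(per_image.items())) if isinstance(per_image, dict) else per_image[0]
print(sample)
```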

muneebable commented 5 years ago

I got the score by typing the above input:

(pytorch) hassanm@decore1:~/gve/pytorch-vision-language/pycocoevalcap/meteor$ java -jar -Xmx2G meteor-1.5.jar - - -stdio -l en -norm
EVAL ||| 9,0 14,0 3,0 7,0 0,0 0,0 3,0 3,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 1,0 1,0 0,0 0,0 4,0 4,0 4,0
0.0712430426716141
0.0712430426716141

The echo $LANG output is fr_FR.UTF-8.

Yeah, I can't get those files yet; I will debug and check that. But the sc-cub-... directory doesn't contain any best-ckpt.pth file.

muneebable commented 5 years ago

Fixed the issue by setting the locale:

$ LANG=en_US.UTF-8
$ locale | grep LANG
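For anyone who wants the fix in code rather than in the shell, a sketch along the lines of the coco-caption patch linked earlier: override the environment just for the METEOR subprocess. This assumes meteor.py builds the Popen call roughly as shown; adapt to the actual code:

```python
import os
import subprocess

# Force an English/C locale for the METEOR subprocess only, so Java
# prints and parses '9.0' instead of '9,0' even under fr_FR.UTF-8.
env = os.environ.copy()
env['LC_ALL'] = 'C'

meteor_cmd = ['java', '-jar', '-Xmx2G', 'meteor-1.5.jar',
              '-', '-', '-stdio', '-l', 'en', '-norm']
meteor_p = subprocess.Popen(meteor_cmd, env=env,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
```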

muneebable commented 5 years ago

Hi, I am able to get a METEOR score of around 28.4, whereas the paper reports 29.2. But I can't get close to the CIDEr score: I got 47.3, whereas the paper reports 56.7. The difference is too large.

salaniz commented 5 years ago

What value do you use for loss_lambda? You will have to do a bit of hyperparameter search to get good values. I only remember that 0.2 is a good starting point.

Since GVE has a REINFORCE loss there is quite a bit of variance involved. A reasonable extension would be to introduce a baseline to reduce the variance.
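To make that suggestion concrete, here is a generic sketch of a moving-average baseline for a REINFORCE-style loss (not code from this repo; log_probs and rewards are placeholders for the model's sampled log-probabilities and reward signal):

```python
import torch

class MovingAverageBaseline:
    """Tracks an exponential moving average of the reward."""
    def __init__(self, decay=0.9):
        self.decay = decay
        self.value = 0.0

    def update(self, reward_mean):
        self.value = self.decay * self.value + (1 - self.decay) * reward_mean

def reinforce_loss(log_probs, rewards, baseline):
    # Subtracting the baseline leaves the gradient estimate unbiased
    # but reduces its variance.
    advantage = rewards - baseline.value
    loss = -(advantage.detach() * log_probs).mean()
    baseline.update(rewards.mean().item())
    return loss
```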

muneebable commented 5 years ago

For loss_lambda, I am using 0.2. The METEOR score is close to the one reported in the paper, but not the CIDEr score.

muneebable commented 5 years ago

Hi salaniz, just a few questions for clarification:

  1. CUB_feature_dict.p contains the features of the CUB dataset, obtained with the compact fine-grained classification model? Is that right? Can we use the images directly instead of the features?

  2. What is the cub_tokens_file for? We also have cub_vocab.pkl, cub_vocab_nounk.txt, and vocab.txt.

  3. Where do the generated sentences for each image come from, the ones saved in descriptions_bird? How are they generated? Are they the ground-truth sentences, or were they annotated by users?

  4. How can we get the classification score or classification label for the CUB dataset using the LRCN model? Also, how can I get the image itself? I am getting a 1D tensor from the CubDataset class.

  5. GVE uses a pretrained LRCN network trained on COCO. From which checkpoint directory does it get the weights if there are so many in the checkpoint folder?

rmaeperito2 commented 2 years ago

Fixed the issue by setting the locale:
$ LANG=en_US.UTF-8
$ locale | grep LANG

May I know how you did it?

regnujAx commented 2 years ago

Fixed the issue by setting the locale:
$ LANG=en_US.UTF-8
$ locale | grep LANG

May I know how you did it?

@rmaeperito2: Maybe you can run export LANG=en_US.UTF-8 in the shell where you want to train the model. But that is only a hotfix and works only for that one window/shell; if you close it, you have to call export LANG=en_US.UTF-8 again the next time. At the moment, I don't know a proper long-term solution, but one candidate is sketched below.
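A process-level option that survives shell restarts because it lives in the code (only a sketch; its placement is an assumption and it only works if it runs before meteor.py spawns the jar, e.g. at the very top of main.py):

```python
import os

# Make the METEOR subprocess inherit an English locale regardless of
# the shell the training run was started from.
os.environ['LANG'] = 'en_US.UTF-8'
```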

salaniz commented 1 year ago

It seems this issue is indeed solved by setting the locale properly. Either way, this is rather an issue with pycocoevalcap. If anyone feels this is important to handle automatically, please open an issue there.