leonnnop / VAR

[CVPR 2022] Visual Abductive Reasoning
MIT License

Problems about results in paper #6

Open ordinarycore opened 1 year ago

ordinarycore commented 1 year ago

Hi! How can I reproduce the results reported in the paper? I get a CIDEr of 34.97 for observed events and 37.68 for explanation events. I have run the code several times, and the results are similar.

LemonQC commented 1 year ago

@ordinarycore My results are as follows. Can you share yours? [image]

ordinarycore commented 1 year ago

{
  "observed": {
    "Bleu_1": [0.2503820488831265],
    "Bleu_2": [0.1314744051225057],
    "Bleu_3": [0.07295467938248024],
    "Bleu_4": [0.04618306165717258],
    "METEOR": [0.10662962550579554],
    "ROUGE_L": [0.24641933236139163],
    "CIDEr": [0.34973493278085094]
  },
  "hypothesis": {
    "Bleu_1": [0.2598156230059454],
    "Bleu_2": [0.13651595121270682],
    "Bleu_3": [0.07889156806042286],
    "Bleu_4": [0.05187654022798021],
    "METEOR": [0.1079388281077964],
    "ROUGE_L": [0.24640443590073993],
    "CIDEr": [0.37681978484812273]
  }
}

However, CIDEr of Hypothesis is much worse than that of observed.
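The scores in the JSON above are fractions, while the numbers quoted elsewhere in this thread are percentages. As a rough sketch of the conversion and of the observed-versus-hypothesis comparison, the snippet below parses a subset of those scores and prints the per-metric gap. The helper names `to_percent` and `hypothesis_minus_observed` are made up for illustration and are not part of the VAR repository.

```python
import json

# Scores copied from the JSON above (only the metrics compared in this thread).
RESULTS_JSON = """
{
  "observed":   {"Bleu_4": [0.04618306165717258], "METEOR": [0.10662962550579554],
                 "ROUGE_L": [0.24641933236139163], "CIDEr": [0.34973493278085094]},
  "hypothesis": {"Bleu_4": [0.05187654022798021], "METEOR": [0.1079388281077964],
                 "ROUGE_L": [0.24640443590073993], "CIDEr": [0.37681978484812273]}
}
"""


def to_percent(results):
    """Hypothetical helper: turn each single-element score list into a percentage."""
    return {
        split: {name: round(values[0] * 100, 2) for name, values in metrics.items()}
        for split, metrics in results.items()
    }


def hypothesis_minus_observed(percent):
    """Hypothetical helper: per-metric gap (hypothesis - observed) in percentage points."""
    return {
        name: round(percent["hypothesis"][name] - percent["observed"][name], 2)
        for name in percent["observed"]
    }


percent = to_percent(json.loads(RESULTS_JSON))
gaps = hypothesis_minus_observed(percent)
for name, gap in gaps.items():
    print(
        f"{name}: observed={percent['observed'][name]:.2f} "
        f"hypothesis={percent['hypothesis'][name]:.2f} gap={gap:+.2f}"
    )
```

A positive gap means the hypothesis split scored higher than the observed split on that metric in this particular run.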

LemonQC commented 1 year ago

The results on CIDEr are lower. My server uses a 3090 GPU with Ubuntu 18.

GrassBro commented 1 year ago

Hi, all,

One of my results is as below:

[Separate Observed] METEOR 10.66 Bleu@4 4.57 CIDEr 35.61 ROUGE_L 24.18 BERT_S 33.76

[Separate Hypothesis] METEOR 10.76 Bleu@4 5.19 CIDEr 37.25 ROUGE_L 24.48 BERT_S 33.21

Intuitively, the observed events should have higher performance. Why do I also get the inverse result?

My GPU is a single V100 card.

@leonnnop Dear Author, sorry for bothering you. Do you have any suggestions?

Best to all.

Xhhhh2002 commented 6 months ago

> The results on CIDEr are lower. My server uses a 3090 GPU with Ubuntu 18.

Hello, I would like to know how to use the test set for testing. Should I directly change the evaluate_mode parameter to test, or do I need to write a separate .py file?