doc-doc / NExT-OE

NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)
MIT License

Evaluation results on testing set #3

Open · HopLee6 opened this issue 1 year ago

HopLee6 commented 1 year ago

I have trained the HGA model and evaluated it on the testing set, but the WUPS is only 23-24.

I also tested the generated answers provided in this repository (HGA-same-att-qns23ans7-test.json), and the WUPS is 24.01.

So how can I reproduce the testing results reported in the paper?

Besides, I trained a blind QA model (question-only, no video input) and its results also reach 23, which is similar to the VideoQA model. The visual information seems NOT to be helpful in this task.
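
For context, WUPS scores answers by WordNet Wu-Palmer similarity rather than exact string match, so numbers in the low 20s are soft-matching scores, not accuracies. Below is a simplified sketch of the metric, not the repo's exact eval_oe.py implementation (it assumes nltk with the wordnet corpus installed):

from nltk.corpus import wordnet as wn  # requires nltk.download('wordnet')

def wup(word_a, word_b):
    # Best Wu-Palmer similarity over all synset pairs; 0.0 if no synsets.
    if word_a == word_b:
        return 1.0
    scores = [sa.wup_similarity(sb)
              for sa in wn.synsets(word_a)
              for sb in wn.synsets(word_b)]
    return max((s for s in scores if s is not None), default=0.0)

def wups(pred, truth, threshold=0.0):
    # Soft set matching between answer words; scores below the
    # threshold are down-weighted by 0.1, as in WUPS@0.9.
    def directed(src, dst):
        score = 1.0
        for w in src:
            best = max((wup(w, d) for d in dst), default=0.0)
            score *= best if best >= threshold else 0.1 * best
        return score
    p, t = pred.split(), truth.split()
    return min(directed(p, t), directed(t, p))

print(wups('ride bicycle', 'ride bike'))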

HU-xiaobai commented 1 year ago

@HopLee6 Hello! Could I ask how you evaluated the WUPS on the test set? I use the command "./main.sh 0 test" but it gets the following error:

[screenshot of the error]

Have you met the error before?

Thanks for reading!

HopLee6 commented 1 year ago

@HU-xiaobai I solved this error by adding the following code after Line 49 in eval_oe.py:

if video not in res:
    # skip videos that have no entry in the prediction file
    continue
if qid not in res[video]:
    # skip questions that have no prediction for this video
    continue

HU-xiaobai commented 1 year ago

@HopLee6 Sorry to bother you again. It seems that the error still happens (first figure below), and the file where the problem occurs is sample_loader.py. Have you met this problem before? By the way, did I add your code correctly (second figure)?

[screenshots: the recurring error and the modified code]

HopLee6 commented 1 year ago

@HU-xiaobai Sorry, but I did not come across this error. I suggest you manually check for the missing keys, as sketched below.
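
A minimal sketch of such a check (the annotation path and column names are assumptions; the prediction layout follows the res[video][qid] access in eval_oe.py):

import json
import pandas as pd

# Prediction file name taken from this thread; annotation path is hypothetical.
res = json.load(open('results/HGA-same-att-qns23ans7-test.json'))
anno = pd.read_csv('dataset/nextqa/test.csv')  # assumed 'video' and 'qid' columns

for _, row in anno.iterrows():
    video, qid = str(row['video']), str(row['qid'])
    if video not in res or qid not in res[video]:
        print(f'missing prediction: video={video}, qid={qid}')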

HU-xiaobai commented 1 year ago

@HopLee6 OK, thanks for the idea! I will check the code in detail and look for the missing keys!

doc-doc commented 1 year ago

Hi, please do not change the evaluation file. Every video-question pair should get a prediction; if some do not, the problem is in the prediction part, not the evaluation. I can reproduce the exact results reported in the paper with both the provided prediction file and the model. Please set up the environment correctly, e.g., pytorch==1.6.0 (CUDA 10.2 or 11.1).
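
If the environment is in doubt, a quick sanity check (plain pytorch, nothing repo-specific):

import torch

print(torch.__version__)          # the thread suggests 1.6.0
print(torch.version.cuda)         # 10.2 or 11.x
print(torch.cuda.is_available())  # should be True on a working setup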

HU-xiaobai commented 1 year ago

@doc-doc Thanks for your response! Could I ask which GPU you used for running the code?

When I install pytorch 1.6 and cuda 10.2, it always gives a warning and cannot run the validation command (first figure below). However, when I use pytorch 1.11 and cuda 11.1, which are compatible with my GPUs (I tried 3090, 3080, A500, and A4000), the validation runs and gives 20.44 (second figure), but the test command fails (the error is the figure in the comment above). Finally, when I install "conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch" in my conda environment, it gets stuck on the validation command (third figure).

Do you know how I could solve the problem? (By the way, when I run the test command, the folder "\data\feats\vid_feat" lacks "app_mot_test.h5", so I copied "app_mot_val.h5" and renamed it as "app_mot_test.h5". I am not sure whether this has an influence or not.)

[screenshots: the warning, the validation result of 20.44, and the stuck run]

Thanks for your time and reading! I am looking forward to your reply!

doc-doc commented 1 year ago

The code was tested on TITAN Xp & V100 with pytorch 1.6.0. The CUDA version can be 10.2 or 11.0/11.1/11.5.

It does not make sense to rename the file; the features must correspond to the videos. You can get the test features from the multi-choice repo or via the link.
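
To confirm that downloaded features actually cover the test videos before running, a hedged sketch (the annotation path, column name, and the 'ids' dataset name are assumptions about the file layout, not documented by the repo):

import h5py
import pandas as pd

test_videos = set(pd.read_csv('dataset/nextqa/test.csv')['video'].astype(str))
with h5py.File('data/feats/vid_feat/app_mot_test.h5', 'r') as f:
    # 'ids' is an assumed dataset of video identifiers; adjust to the real key.
    ids = f['ids'][...]
feat_videos = {v.decode() if isinstance(v, bytes) else str(v) for v in ids}
print(len(test_videos - feat_videos), 'test videos have no features')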

HU-xiaobai commented 1 year ago

@doc-doc That is really helpful! Yes, with pytorch 1.6.0 and cuda 10.2 I can now reproduce the test result (figure below). By the way, I see that the test set includes the file "bert_ft_test.h5". Could I ask whether it is the video representation extracted from BERT? What is it used for? Could we use it to replace the original "vid_feat" for the NExT-OE (HGA) model, or is it just used for the multiple-choice models?
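
One quick way to see what "bert_ft_test.h5" actually stores is to list its datasets (a minimal sketch; the path is an assumption):

import h5py

with h5py.File('data/feats/bert_ft_test.h5', 'r') as f:
    for name in f.keys():
        print(name, f[name])  # repr shows shape and dtype for each dataset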

Thanks for your help, and also for @HopLee6's help!

[screenshot of the reproduced test result]