Open emmating12 opened 6 months ago
Hi! Could you provide your environment list, like torch and CUDA version?
Hi, python=3.10.13, torch=1.13.1+cu117, torchvision=0.14.1+cu117, cuda=11.7.
For me, the code is run on an A100 with
Python=3.7.12
cuda=11.7
torch=1.13.1+cu117
torchvision=0.14.1+cu117
Hi, I have tested the VideoChat2 model on an A100 with python=3.8, torch=1.13.1+cu117, torchvision=0.14.1+cu117, cuda=11.7. My result for "Episodic Reasoning" is 38.5%, which differs from the paper. The other results are the same. Could you help me find the reason?
Hi! I think the reason is that you are using the old version of the inference code. In the new version, I set True to use the temporal boundary, which improves the results slightly.
@emmating12 Hi, we have the same reproduction results. Did you find a way to reproduce the performance on Episodic Reasoning?
@Andy1621 Thanks for the info. I used the mvbench.ipynb with the True for Episodic Reasoning, but the performance is still 38.5% instead of 40.5%. Do you have any other suggestions?
Hi! I'm not sure whether you have inferred the model correctly.
Originally, when I tested MVBench, I forgot to use start and end for TVQA, and so got 38.5%, the same as you. But when I fixed the bug and used start and end (setting True), the result increased as expected to 40.5%.
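To illustrate why the start and end boundary can change the score, here is a minimal sketch of boundary-aware frame sampling. It is not the repository's code: the function name `sample_frame_indices`, its parameters, and the example values are all assumptions made for illustration. The idea is only that, with a boundary, the sampled frames come from the annotated clip instead of the whole video.

```python
def sample_frame_indices(num_frames, fps, num_segments, bound=None):
    """Pick one frame index from the middle of each of `num_segments` equal spans.

    Hypothetical helper for illustration only.
    bound: optional (start_sec, end_sec) temporal boundary; None means the
    whole video is used.
    """
    if bound is not None:
        start_sec, end_sec = bound
        # Convert seconds to frame indices and clamp to the valid range.
        start_idx = max(0, round(start_sec * fps))
        end_idx = min(round(end_sec * fps), num_frames - 1)
    else:
        start_idx, end_idx = 0, num_frames - 1

    seg_size = (end_idx - start_idx) / num_segments
    return [int(start_idx + seg_size / 2 + seg_size * i) for i in range(num_segments)]


# Example values (assumed): a 300-frame video at 3 fps, 4 sampled frames.
whole_video = sample_frame_indices(300, fps=3, num_segments=4)
clip_only = sample_frame_indices(300, fps=3, num_segments=4, bound=(10, 20))
print(whole_video)
print(clip_only)
```

If the boundary is ignored, the model sees frames spread over the full video, many of which may be unrelated to the question, which would be consistent with the lower Episodic Reasoning number reported above.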
Hi, I have tested the VideoChat2 model on my server and found that the test results are different from the paper. My results are listed as follows: {"Action Sequence": 66.0, "Action Prediction": 47.5, "Action Antonym": 83.5, "Fine-grained Action": 49.5, "Unexpected Action": 60.0, "Object Existence": 57.99999999999999, "Object Interaction": 71.5, "Object Shuffle": 41.5, "Moving Direction": 23.0, "Action Localization": 22.5, "Scene Transition": 88.5, "Action Count": 39.5, "Moving Count": 42.0, "Moving Attribute": 58.5, "State Change": 44.0, "Fine-grained Pose": 49.0, "Character Order": 36.5, "Egocentric Navigation": 35.0, "Episodic Reasoning": 38.5, "Counterfactual Inference": 65.0, "Avg": 50.975} The results for OS, AL, AC, ER, and CI are different. Could you help me find the reasons?
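For anyone comparing their own run, a small script like the following can recompute the average and list per-task gaps against reference numbers. This is only a sketch: the helper names `mean_accuracy` and `diff_against` are made up here, and the reference dictionary contains only the one value quoted in this thread (40.5 for Episodic Reasoning); the remaining paper numbers would need to be filled in by hand.

```python
# Per-task accuracies copied from the post above (the "Avg" entry is recomputed).
results = {
    "Action Sequence": 66.0, "Action Prediction": 47.5, "Action Antonym": 83.5,
    "Fine-grained Action": 49.5, "Unexpected Action": 60.0,
    "Object Existence": 57.99999999999999, "Object Interaction": 71.5,
    "Object Shuffle": 41.5, "Moving Direction": 23.0, "Action Localization": 22.5,
    "Scene Transition": 88.5, "Action Count": 39.5, "Moving Count": 42.0,
    "Moving Attribute": 58.5, "State Change": 44.0, "Fine-grained Pose": 49.0,
    "Character Order": 36.5, "Egocentric Navigation": 35.0,
    "Episodic Reasoning": 38.5, "Counterfactual Inference": 65.0,
}


def mean_accuracy(scores):
    """Unweighted mean over tasks, skipping any precomputed 'Avg' entry."""
    tasks = {k: v for k, v in scores.items() if k != "Avg"}
    return sum(tasks.values()) / len(tasks)


def diff_against(scores, reference):
    """Return score minus reference for every task present in both dicts."""
    return {
        task: round(scores[task] - ref, 2)
        for task, ref in reference.items()
        if task in scores
    }


# Only the value mentioned in this thread; add the other paper numbers manually.
reference = {"Episodic Reasoning": 40.5}

print(round(mean_accuracy(results), 3))
print(diff_against(results, reference))
```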