OpenGVLab / UniFormerV2

[ICCV2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
https://arxiv.org/abs/2211.09552
Apache License 2.0
Calculating average of predicted results #29

Closed by anjugopinath 1 year ago

anjugopinath commented 1 year ago

Hi,

I am performing evaluation on 10 images. I know that the code samples N clips and 3 crops from a single input video.

The code below is from the perform_test() function in test_net.py:

image

Below are the results of the 2 print statements marked as 1 and 2 in green color respectively:

image

Why is one of size 28 and the other of size 10? Where in the code do you perform averaging to calculate the final results?

Andy1621 commented 1 year ago

Note that pred is the prediction from the current forward pass, so its size is batch_size x num_clips x num_crops.

However, test_meter.video_preds stores the predictions for the whole test dataset, so its length is num_test_videos.

The final test results are computed in utils/meters.py.
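
To make the difference between the two sizes concrete, here is a minimal sketch of how per-view predictions could be folded into one prediction per video. It is not the code from utils/meters.py. The function name aggregate_clip_preds, the parameter views_per_video (num_clips x num_crops), and the rule that a view's index divided by views_per_video gives its video index are all assumptions made for illustration, as is the choice to sum the scores.

```python
def aggregate_clip_preds(clip_preds, clip_idx, num_videos, views_per_video):
    """Sum per-view class scores into one score vector per video.

    Hypothetical helper, not the repository's API.
    clip_preds: list of per-view score lists, one entry per (clip, crop) view.
    clip_idx: global index of each view; assumed layout is that views of
              video v occupy indices v * views_per_video .. (v + 1) * views_per_video - 1.
    """
    num_classes = len(clip_preds[0])
    video_preds = [[0.0] * num_classes for _ in range(num_videos)]
    for scores, idx in zip(clip_preds, clip_idx):
        vid = idx // views_per_video  # assumed mapping from view to video
        for c, s in enumerate(scores):
            video_preds[vid][c] += s
    return video_preds


# Toy example: 2 videos, 2 views each, 2 classes.
clip_preds = [[1.0, 3.0], [2.0, 2.0], [4.0, 0.0], [1.0, 1.0]]
clip_idx = [0, 1, 2, 3]
video_preds = aggregate_clip_preds(clip_preds, clip_idx, num_videos=2, views_per_video=2)

# Dividing by the number of views gives the average; the argmax is unchanged.
video_avg = [[s / 2 for s in scores] for scores in video_preds]
video_labels = [scores.index(max(scores)) for scores in video_preds]
```

In this toy case clip_preds has 4 rows (one per view) while video_preds has 2 rows (one per video), which mirrors why pred and test_meter.video_preds have different lengths.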