Accuracy calculation in test has redundant instances

MCG-NJU / VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

The Finetune section says that, during testing, multiple segments and multiple crops are considered for each video:

    --test_num_segment 2 \
    --test_num_crop 3 \

But when accuracy is calculated, the scores are not aggregated over these segments and crops, so every (segment, crop) view of a video is counted as a separate instance. See https://github.com/MCG-NJU/VideoMAE/blob/main/engine_for_finetuning.py#L180 and https://github.com/rwightman/pytorch-image-models/blob/master/timm/utils/metrics.py#L25

    acc1, acc5 = accuracy(output, target, topk=(1, 5))

Instead, the outputs should be grouped per video before accuracy is calculated, something like this:

    import pandas as pd
    # One row per (segment, crop) view; "id" identifies the source video.
    scores = pd.DataFrame({"id": ids, "outputs": outputs, "labels": labels})
    scores = scores.groupby(["id", "labels"]).aggregate({"outputs": "max"})
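
To make the idea concrete, here is a minimal sketch of video-level aggregation. It rests on assumptions that are not taken from the repository: every test view comes with a video id, the model outputs are per-class logits, and views are combined by averaging their softmax probabilities (one of several reasonable choices, and different from the per-column max above). The names `softmax` and `video_level_accuracy` are hypothetical helpers, not part of VideoMAE or timm.

```python
import numpy as np


def softmax(x):
    # Row-wise softmax, shifted by the row max for numerical stability.
    x = x - x.max(axis=1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=1, keepdims=True)


def video_level_accuracy(ids, logits, labels, topk=(1, 5)):
    """Average the per-view probabilities of each video, then score each video once.

    ids:    one video id per view (length = number of views)
    logits: array of shape (number of views, number of classes)
    labels: one ground-truth class per view (identical for all views of a video)
    Returns a dict mapping k to the top-k accuracy over videos, as a fraction.
    """
    ids = np.asarray(ids)
    labels = np.asarray(labels)
    probs = softmax(np.asarray(logits, dtype=np.float64))

    # Map every view to the index of its video.
    unique_ids, inverse = np.unique(ids, return_inverse=True)
    num_videos = len(unique_ids)

    # Sum the probabilities of all views belonging to the same video, then average.
    summed = np.zeros((num_videos, probs.shape[1]))
    np.add.at(summed, inverse, probs)
    counts = np.bincount(inverse, minlength=num_videos)
    mean_probs = summed / counts[:, None]

    # One label per video (all views of a video are assumed to share it).
    video_labels = np.zeros(num_videos, dtype=labels.dtype)
    video_labels[inverse] = labels

    # Classes sorted by descending averaged probability.
    ranked = np.argsort(-mean_probs, axis=1)
    return {
        k: float((ranked[:, :k] == video_labels[:, None]).any(axis=1).mean())
        for k in topk
    }
```

With `--test_num_segment 2` and `--test_num_crop 3`, each video would contribute six rows to `logits` but only one prediction to the final accuracy.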

Accuracy calculation in test has redundant instances #46