AP10k Evaluation Results

Hi, I am trying to replicate the evaluation results on the AP10k test set using ViTPose+-Base, as reported in the paper and the repo.

[Screenshot from 2024-10-29 17-35-51]

The configuration file I used is configs/animal/2d_kpt_sview_rgb_img/topdown_heatmap/ap10k/ViTPose_base_ap10k_256x192.py, and the checkpoint file, vitpose+_base.pth, was downloaded from the OneDrive folder. I ran the command

    bash tools/dist_test.sh configs/animal/2d_kpt_sview_rgb_img/topdown_heatmap/ap10k/ViTPose_base_ap10k_256x192.py checkpoints/vitpose+_base.pth 4

and got 0 AP. The full output is below:
/mnt/aperto/anaconda/envs/vitpose/lib/python3.9/site-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See 
https://pytorch.org/docs/stable/distributed.html#launch-utility for 
further instructions

  warnings.warn(
WARNING:torch.distributed.run:
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
*****************************************
apex is not installed
apex is not installed
apex is not installed
apex is not installed
apex is not installed
apex is not installed
apex is not installed
apex is not installed
apex is not installed
apex is not installed
apex is not installed
apex is not installed
/mnt/aperto/tianyi/ViTPose/mmpose/utils/setup_env.py:42: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  warnings.warn(
/mnt/aperto/tianyi/ViTPose/mmpose/utils/setup_env.py:42: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  warnings.warn(
/mnt/aperto/tianyi/ViTPose/mmpose/utils/setup_env.py:42: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  warnings.warn(
/mnt/aperto/tianyi/ViTPose/mmpose/utils/setup_env.py:42: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  warnings.warn(
loading annotations into memory...
loading annotations into memory...
loading annotations into memory...
loading annotations into memory...
Done (t=0.02s)
creating index...
index created!
Done (t=0.02s)
creating index...
Done (t=0.02s)
creating index...
index created!
index created!
Done (t=0.02s)
creating index...
index created!
=> num_images: 1997
=> load 2634 samples
=> num_images: 1997
=> load 2634 samples
=> num_images: 1997
=> load 2634 samples
=> num_images: 1997
=> load 2634 samples
Use load_from_local loader
Use load_from_local loader
Use load_from_local loader
Use load_from_local loader
The model and loaded state dict do not match exactly

size mismatch for backbone.blocks.0.mlp.fc2.weight: copying a param with shape torch.Size([576, 3072]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
size mismatch for backbone.blocks.0.mlp.fc2.bias: copying a param with shape torch.Size([576]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for backbone.blocks.1.mlp.fc2.weight: copying a param with shape torch.Size([576, 3072]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
size mismatch for backbone.blocks.1.mlp.fc2.bias: copying a param with shape torch.Size([576]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for backbone.blocks.2.mlp.fc2.weight: copying a param with shape torch.Size([576, 3072]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
size mismatch for backbone.blocks.2.mlp.fc2.bias: copying a param with shape torch.Size([576]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for backbone.blocks.3.mlp.fc2.weight: copying a param with shape torch.Size([576, 3072]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
size mismatch for backbone.blocks.3.mlp.fc2.bias: copying a param with shape torch.Size([576]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for backbone.blocks.4.mlp.fc2.weight: copying a param with shape torch.Size([576, 3072]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
size mismatch for backbone.blocks.4.mlp.fc2.bias: copying a param with shape torch.Size([576]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for backbone.blocks.5.mlp.fc2.weight: copying a param with shape torch.Size([576, 3072]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
size mismatch for backbone.blocks.5.mlp.fc2.bias: copying a param with shape torch.Size([576]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for backbone.blocks.6.mlp.fc2.weight: copying a param with shape torch.Size([576, 3072]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
size mismatch for backbone.blocks.6.mlp.fc2.bias: copying a param with shape torch.Size([576]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for backbone.blocks.7.mlp.fc2.weight: copying a param with shape torch.Size([576, 3072]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
size mismatch for backbone.blocks.7.mlp.fc2.bias: copying a param with shape torch.Size([576]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for backbone.blocks.8.mlp.fc2.weight: copying a param with shape torch.Size([576, 3072]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
size mismatch for backbone.blocks.8.mlp.fc2.bias: copying a param with shape torch.Size([576]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for backbone.blocks.9.mlp.fc2.weight: copying a param with shape torch.Size([576, 3072]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
size mismatch for backbone.blocks.9.mlp.fc2.bias: copying a param with shape torch.Size([576]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for backbone.blocks.10.mlp.fc2.weight: copying a param with shape torch.Size([576, 3072]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
size mismatch for backbone.blocks.10.mlp.fc2.bias: copying a param with shape torch.Size([576]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for backbone.blocks.11.mlp.fc2.weight: copying a param with shape torch.Size([576, 3072]) from checkpoint, the shape in current model is torch.Size([768, 3072]).
size mismatch for backbone.blocks.11.mlp.fc2.bias: copying a param with shape torch.Size([576]) from checkpoint, the shape in current model is torch.Size([768]).
unexpected key in source state_dict: associate_keypoint_heads.0.deconv_layers.0.weight, associate_keypoint_heads.0.deconv_layers.1.weight, associate_keypoint_heads.0.deconv_layers.1.bias, associate_keypoint_heads.0.deconv_layers.1.running_mean, associate_keypoint_heads.0.deconv_layers.1.running_var, associate_keypoint_heads.0.deconv_layers.1.num_batches_tracked, associate_keypoint_heads.0.deconv_layers.3.weight, associate_keypoint_heads.0.deconv_layers.4.weight, associate_keypoint_heads.0.deconv_layers.4.bias, associate_keypoint_heads.0.deconv_layers.4.running_mean, associate_keypoint_heads.0.deconv_layers.4.running_var, associate_keypoint_heads.0.deconv_layers.4.num_batches_tracked, associate_keypoint_heads.0.final_layer.weight, associate_keypoint_heads.0.final_layer.bias, associate_keypoint_heads.1.deconv_layers.0.weight, associate_keypoint_heads.1.deconv_layers.1.weight, associate_keypoint_heads.1.deconv_layers.1.bias, associate_keypoint_heads.1.deconv_layers.1.running_mean, associate_keypoint_heads.1.deconv_layers.1.running_var, associate_keypoint_heads.1.deconv_layers.1.num_batches_tracked, associate_keypoint_heads.1.deconv_layers.3.weight, associate_keypoint_heads.1.deconv_layers.4.weight, associate_keypoint_heads.1.deconv_layers.4.bias, associate_keypoint_heads.1.deconv_layers.4.running_mean, associate_keypoint_heads.1.deconv_layers.4.running_var, associate_keypoint_heads.1.deconv_layers.4.num_batches_tracked, associate_keypoint_heads.1.final_layer.weight, associate_keypoint_heads.1.final_layer.bias, associate_keypoint_heads.2.deconv_layers.0.weight, associate_keypoint_heads.2.deconv_layers.1.weight, associate_keypoint_heads.2.deconv_layers.1.bias, associate_keypoint_heads.2.deconv_layers.1.running_mean, associate_keypoint_heads.2.deconv_layers.1.running_var, associate_keypoint_heads.2.deconv_layers.1.num_batches_tracked, associate_keypoint_heads.2.deconv_layers.3.weight, associate_keypoint_heads.2.deconv_layers.4.weight, 
associate_keypoint_heads.2.deconv_layers.4.bias, associate_keypoint_heads.2.deconv_layers.4.running_mean, associate_keypoint_heads.2.deconv_layers.4.running_var, associate_keypoint_heads.2.deconv_layers.4.num_batches_tracked, associate_keypoint_heads.2.final_layer.weight, associate_keypoint_heads.2.final_layer.bias, associate_keypoint_heads.3.deconv_layers.0.weight, associate_keypoint_heads.3.deconv_layers.1.weight, associate_keypoint_heads.3.deconv_layers.1.bias, associate_keypoint_heads.3.deconv_layers.1.running_mean, associate_keypoint_heads.3.deconv_layers.1.running_var, associate_keypoint_heads.3.deconv_layers.1.num_batches_tracked, associate_keypoint_heads.3.deconv_layers.3.weight, associate_keypoint_heads.3.deconv_layers.4.weight, associate_keypoint_heads.3.deconv_layers.4.bias, associate_keypoint_heads.3.deconv_layers.4.running_mean, associate_keypoint_heads.3.deconv_layers.4.running_var, associate_keypoint_heads.3.deconv_layers.4.num_batches_tracked, associate_keypoint_heads.3.final_layer.weight, associate_keypoint_heads.3.final_layer.bias, associate_keypoint_heads.4.deconv_layers.0.weight, associate_keypoint_heads.4.deconv_layers.1.weight, associate_keypoint_heads.4.deconv_layers.1.bias, associate_keypoint_heads.4.deconv_layers.1.running_mean, associate_keypoint_heads.4.deconv_layers.1.running_var, associate_keypoint_heads.4.deconv_layers.1.num_batches_tracked, associate_keypoint_heads.4.deconv_layers.3.weight, associate_keypoint_heads.4.deconv_layers.4.weight, associate_keypoint_heads.4.deconv_layers.4.bias, associate_keypoint_heads.4.deconv_layers.4.running_mean, associate_keypoint_heads.4.deconv_layers.4.running_var, associate_keypoint_heads.4.deconv_layers.4.num_batches_tracked, associate_keypoint_heads.4.final_layer.weight, associate_keypoint_heads.4.final_layer.bias, backbone.blocks.0.mlp.experts.0.weight, backbone.blocks.0.mlp.experts.0.bias, backbone.blocks.0.mlp.experts.1.weight, backbone.blocks.0.mlp.experts.1.bias, 
backbone.blocks.0.mlp.experts.2.weight, backbone.blocks.0.mlp.experts.2.bias, backbone.blocks.0.mlp.experts.3.weight, backbone.blocks.0.mlp.experts.3.bias, backbone.blocks.0.mlp.experts.4.weight, backbone.blocks.0.mlp.experts.4.bias, backbone.blocks.0.mlp.experts.5.weight, backbone.blocks.0.mlp.experts.5.bias, backbone.blocks.1.mlp.experts.0.weight, backbone.blocks.1.mlp.experts.0.bias, backbone.blocks.1.mlp.experts.1.weight, backbone.blocks.1.mlp.experts.1.bias, backbone.blocks.1.mlp.experts.2.weight, backbone.blocks.1.mlp.experts.2.bias, backbone.blocks.1.mlp.experts.3.weight, backbone.blocks.1.mlp.experts.3.bias, backbone.blocks.1.mlp.experts.4.weight, backbone.blocks.1.mlp.experts.4.bias, backbone.blocks.1.mlp.experts.5.weight, backbone.blocks.1.mlp.experts.5.bias, backbone.blocks.2.mlp.experts.0.weight, backbone.blocks.2.mlp.experts.0.bias, backbone.blocks.2.mlp.experts.1.weight, backbone.blocks.2.mlp.experts.1.bias, backbone.blocks.2.mlp.experts.2.weight, backbone.blocks.2.mlp.experts.2.bias, backbone.blocks.2.mlp.experts.3.weight, backbone.blocks.2.mlp.experts.3.bias, backbone.blocks.2.mlp.experts.4.weight, backbone.blocks.2.mlp.experts.4.bias, backbone.blocks.2.mlp.experts.5.weight, backbone.blocks.2.mlp.experts.5.bias, backbone.blocks.3.mlp.experts.0.weight, backbone.blocks.3.mlp.experts.0.bias, backbone.blocks.3.mlp.experts.1.weight, backbone.blocks.3.mlp.experts.1.bias, backbone.blocks.3.mlp.experts.2.weight, backbone.blocks.3.mlp.experts.2.bias, backbone.blocks.3.mlp.experts.3.weight, backbone.blocks.3.mlp.experts.3.bias, backbone.blocks.3.mlp.experts.4.weight, backbone.blocks.3.mlp.experts.4.bias, backbone.blocks.3.mlp.experts.5.weight, backbone.blocks.3.mlp.experts.5.bias, backbone.blocks.4.mlp.experts.0.weight, backbone.blocks.4.mlp.experts.0.bias, backbone.blocks.4.mlp.experts.1.weight, backbone.blocks.4.mlp.experts.1.bias, backbone.blocks.4.mlp.experts.2.weight, backbone.blocks.4.mlp.experts.2.bias, backbone.blocks.4.mlp.experts.3.weight, 
backbone.blocks.4.mlp.experts.3.bias, backbone.blocks.4.mlp.experts.4.weight, backbone.blocks.4.mlp.experts.4.bias, backbone.blocks.4.mlp.experts.5.weight, backbone.blocks.4.mlp.experts.5.bias, backbone.blocks.5.mlp.experts.0.weight, backbone.blocks.5.mlp.experts.0.bias, backbone.blocks.5.mlp.experts.1.weight, backbone.blocks.5.mlp.experts.1.bias, backbone.blocks.5.mlp.experts.2.weight, backbone.blocks.5.mlp.experts.2.bias, backbone.blocks.5.mlp.experts.3.weight, backbone.blocks.5.mlp.experts.3.bias, backbone.blocks.5.mlp.experts.4.weight, backbone.blocks.5.mlp.experts.4.bias, backbone.blocks.5.mlp.experts.5.weight, backbone.blocks.5.mlp.experts.5.bias, backbone.blocks.6.mlp.experts.0.weight, backbone.blocks.6.mlp.experts.0.bias, backbone.blocks.6.mlp.experts.1.weight, backbone.blocks.6.mlp.experts.1.bias, backbone.blocks.6.mlp.experts.2.weight, backbone.blocks.6.mlp.experts.2.bias, backbone.blocks.6.mlp.experts.3.weight, backbone.blocks.6.mlp.experts.3.bias, backbone.blocks.6.mlp.experts.4.weight, backbone.blocks.6.mlp.experts.4.bias, backbone.blocks.6.mlp.experts.5.weight, backbone.blocks.6.mlp.experts.5.bias, backbone.blocks.7.mlp.experts.0.weight, backbone.blocks.7.mlp.experts.0.bias, backbone.blocks.7.mlp.experts.1.weight, backbone.blocks.7.mlp.experts.1.bias, backbone.blocks.7.mlp.experts.2.weight, backbone.blocks.7.mlp.experts.2.bias, backbone.blocks.7.mlp.experts.3.weight, backbone.blocks.7.mlp.experts.3.bias, backbone.blocks.7.mlp.experts.4.weight, backbone.blocks.7.mlp.experts.4.bias, backbone.blocks.7.mlp.experts.5.weight, backbone.blocks.7.mlp.experts.5.bias, backbone.blocks.8.mlp.experts.0.weight, backbone.blocks.8.mlp.experts.0.bias, backbone.blocks.8.mlp.experts.1.weight, backbone.blocks.8.mlp.experts.1.bias, backbone.blocks.8.mlp.experts.2.weight, backbone.blocks.8.mlp.experts.2.bias, backbone.blocks.8.mlp.experts.3.weight, backbone.blocks.8.mlp.experts.3.bias, backbone.blocks.8.mlp.experts.4.weight, backbone.blocks.8.mlp.experts.4.bias, 
backbone.blocks.8.mlp.experts.5.weight, backbone.blocks.8.mlp.experts.5.bias, backbone.blocks.9.mlp.experts.0.weight, backbone.blocks.9.mlp.experts.0.bias, backbone.blocks.9.mlp.experts.1.weight, backbone.blocks.9.mlp.experts.1.bias, backbone.blocks.9.mlp.experts.2.weight, backbone.blocks.9.mlp.experts.2.bias, backbone.blocks.9.mlp.experts.3.weight, backbone.blocks.9.mlp.experts.3.bias, backbone.blocks.9.mlp.experts.4.weight, backbone.blocks.9.mlp.experts.4.bias, backbone.blocks.9.mlp.experts.5.weight, backbone.blocks.9.mlp.experts.5.bias, backbone.blocks.10.mlp.experts.0.weight, backbone.blocks.10.mlp.experts.0.bias, backbone.blocks.10.mlp.experts.1.weight, backbone.blocks.10.mlp.experts.1.bias, backbone.blocks.10.mlp.experts.2.weight, backbone.blocks.10.mlp.experts.2.bias, backbone.blocks.10.mlp.experts.3.weight, backbone.blocks.10.mlp.experts.3.bias, backbone.blocks.10.mlp.experts.4.weight, backbone.blocks.10.mlp.experts.4.bias, backbone.blocks.10.mlp.experts.5.weight, backbone.blocks.10.mlp.experts.5.bias, backbone.blocks.11.mlp.experts.0.weight, backbone.blocks.11.mlp.experts.0.bias, backbone.blocks.11.mlp.experts.1.weight, backbone.blocks.11.mlp.experts.1.bias, backbone.blocks.11.mlp.experts.2.weight, backbone.blocks.11.mlp.experts.2.bias, backbone.blocks.11.mlp.experts.3.weight, backbone.blocks.11.mlp.experts.3.bias, backbone.blocks.11.mlp.experts.4.weight, backbone.blocks.11.mlp.experts.4.bias, backbone.blocks.11.mlp.experts.5.weight, backbone.blocks.11.mlp.experts.5.bias

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 2636/2634, 954.7 task/s, elapsed: 3s, ETA:     0s
Loading and preparing results...
DONE (t=0.07s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *keypoints*
DONE (t=1.17s).
Accumulating evaluation results...
DONE (t=0.06s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] =  0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets= 20 ] =  0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets= 20 ] =  0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] =  0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] =  0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] =  0.000
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets= 20 ] =  0.000
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets= 20 ] =  0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] =  0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] =  0.000
AP: 0.0
AP (L): 0.0
AP (M): 0.0
AP .5: 0.0
AP .75: 0.0
AR: 0.0
AR (L): 0.0
AR (M): 0.0
AR .5: 0.0
AR .75: 0.0
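
The load warnings in the log point at a likely cause. The checkpoint stores each `mlp.fc2` with 576 output features plus separate `mlp.experts.*` and `associate_keypoint_heads.*` tensors, while the config builds plain 768-wide `fc2` layers. Tensors with mismatched shapes are normally skipped at load time, so every block's `fc2` would keep its initial weights, which is consistent with 0 AP. Below is a minimal sketch of how such a mismatch can be summarized before running a full evaluation. It works on plain `{name: shape}` dicts so it runs without PyTorch. The function name `compare_shapes` is hypothetical, and the expert shapes are an assumption (the log lists the expert keys but not their sizes); only the `fc2` shapes are copied from the log.

```python
def compare_shapes(ckpt_shapes, model_shapes):
    """Return (size_mismatch, unexpected, missing) between two shape dicts.

    With PyTorch, such dicts can be built as
    {k: tuple(v.shape) for k, v in state_dict.items()}.
    """
    # Keys present on both sides whose shapes differ.
    size_mismatch = {
        name: (ckpt_shapes[name], model_shapes[name])
        for name in ckpt_shapes
        if name in model_shapes and ckpt_shapes[name] != model_shapes[name]
    }
    # Keys only in the checkpoint, and keys only in the model.
    unexpected = sorted(set(ckpt_shapes) - set(model_shapes))
    missing = sorted(set(model_shapes) - set(ckpt_shapes))
    return size_mismatch, unexpected, missing


# Block 0 only. fc2 shapes are taken from the log above.
# The expert shapes are ASSUMED (768 - 576 = 192 output features).
ckpt = {
    "backbone.blocks.0.mlp.fc2.weight": (576, 3072),
    "backbone.blocks.0.mlp.fc2.bias": (576,),
    "backbone.blocks.0.mlp.experts.0.weight": (192, 3072),  # assumed shape
    "backbone.blocks.0.mlp.experts.0.bias": (192,),         # assumed shape
}
model = {
    "backbone.blocks.0.mlp.fc2.weight": (768, 3072),
    "backbone.blocks.0.mlp.fc2.bias": (768,),
}

mismatch, unexpected, missing = compare_shapes(ckpt, model)
for name, (got, want) in mismatch.items():
    print(f"size mismatch: {name}: checkpoint {got} vs model {want}")
print("unexpected:", unexpected)
print("missing:", missing)
```

If the real checkpoint shows the same pattern, the weights probably need to be converted to the single-task layout (or evaluated with a config written for ViTPose+) before the numbers can match the reported ones. The repository README is the place to confirm which conversion step the authors intend.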
ViTAE-Transformer / ViTPose

AP10k Evaluation Results #152