microsoft / Cream

This is a collection of our NAS and Vision Transformer work.
MIT License

Some evaluation problems about TinyViT #226

Closed maybeliuchuan closed 4 months ago

maybeliuchuan commented 4 months ago

Hello, thank you for sharing such meaningful work.

I ran into a problem while trying to evaluate the TinyViT model with the given command:

python -m torch.distributed.launch --nproc_per_node 8 main.py --cfg configs/22kto1k/tiny_vit_5m_22kto1k.yaml --data-path ./ImageNet --batch-size 128 --eval --resume ./checkpoints/tiny_vit_5m_22kto1k_distill.pth

I have prepared the dataset and the checkpoint file. However, when I evaluate the model, the average top-1 accuracy is only 0.1%, which suggests that the model has random parameters and the checkpoint was not loaded properly. After checking the main code, I still haven't solved the problem. Is there something wrong with the checkpoint file provided in the model zoo? Could you please help me and give me some suggestions? Thanks a lot.

The following is part of the output log:

[2024-02-28 09:36:24 TinyViT-5M-1k](main.py 85): INFO number of params: 5392764
All checkpoints founded in output/TinyViT-5M-1k/default: []
[2024-02-28 09:36:24 TinyViT-5M-1k](main.py 121): INFO no checkpoint found in output/TinyViT-5M-1k/default, ignoring auto resume
[2024-02-28 09:36:24 TinyViT-5M-1k](utils.py 58): INFO ==============> Resuming form ./checkpoints/tiny_vit_5m_1k.pth....................
All checkpoints founded in output/TinyViT-5M-1k/default: []
[2024-02-28 09:36:24 TinyViT-5M-1k](utils.py 92): INFO
[2024-02-28 09:36:31 TinyViT-5M-1k](main.py 441): INFO Test: [0/196] Time 6.721 (6.721) Loss 8.4219 (8.4219) Acc@1 0.000 (0.000) Acc@5 0.000 (0.000) Mem 2393MB
[2024-02-28 09:36:37 TinyViT-5M-1k](main.py 441): INFO Test: [10/196] Time 0.909 (1.169) Loss 8.3828 (8.4226) Acc@1 0.000 (0.000) Acc@5 0.000 (0.142) Mem 2393MB
[2024-02-28 09:36:41 TinyViT-5M-1k](main.py 441): INFO Test: [20/196] Time 0.083 (0.787) Loss 8.3438 (8.4066) Acc@1 0.000 (0.000) Acc@5 0.000 (0.186) Mem 2393MB
[2024-02-28 09:36:45 TinyViT-5M-1k](main.py 441): INFO Test: [30/196] Time 0.109 (0.684) Loss 8.2734 (8.3916) Acc@1 0.000 (0.050) Acc@5 0.000 (0.277) Mem 2393MB
[2024-02-28 09:36:50 TinyViT-5M-1k](main.py 441): INFO Test: [40/196] Time 0.600 (0.626) Loss 8.4297 (8.3893) Acc@1 0.000 (0.095) Acc@5 0.000 (0.305) Mem 2393MB
[2024-02-28 09:36:57 TinyViT-5M-1k](main.py 441): INFO Test: [50/196] Time 0.528 (0.636) Loss 8.5078 (8.3879) Acc@1 0.000 (0.123) Acc@5 0.000 (0.291) Mem 2393MB
[2024-02-28 09:37:01 TinyViT-5M-1k](main.py 441): INFO Test: [60/196] Time 0.076 (0.603) Loss 8.3984 (8.3942) Acc@1 0.000 (0.102) Acc@5 0.000 (0.282) Mem 2393MB
[2024-02-28 09:37:05 TinyViT-5M-1k](main.py 441): INFO Test: [70/196] Time 0.078 (0.574) Loss 8.3203 (8.3919) Acc@1 0.781 (0.110) Acc@5 0.781 (0.286) Mem 2393MB
[2024-02-28 09:37:09 TinyViT-5M-1k](main.py 441): INFO Test: [80/196] Time 0.096 (0.556) Loss 8.4609 (8.3921) Acc@1 0.000 (0.106) Acc@5 0.781 (0.299) Mem 2393MB
[2024-02-28 09:37:16 TinyViT-5M-1k](main.py 441): INFO Test: [90/196] Time 0.230 (0.567) Loss 8.3516 (8.3912) Acc@1 0.000 (0.094) Acc@5 0.000 (0.283) Mem 2393MB
[2024-02-28 09:37:20 TinyViT-5M-1k](main.py 441): INFO Test: [100/196] Time 0.078 (0.553) Loss 8.4219 (8.3875) Acc@1 0.000 (0.124) Acc@5 0.781 (0.309) Mem 2393MB
[2024-02-28 09:37:24 TinyViT-5M-1k](main.py 441): INFO Test: [110/196] Time 0.076 (0.540) Loss 8.5547 (8.3884) Acc@1 0.000 (0.120) Acc@5 0.000 (0.303) Mem 2393MB
[2024-02-28 09:37:28 TinyViT-5M-1k](main.py 441): INFO Test: [120/196] Time 0.128 (0.531) Loss 8.4453 (8.3907) Acc@1 0.000 (0.110) Acc@5 0.000 (0.291) Mem 2393MB
[2024-02-28 09:37:35 TinyViT-5M-1k](main.py 441): INFO Test: [130/196] Time 0.218 (0.540) Loss 8.5391 (8.3936) Acc@1 0.000 (0.107) Acc@5 0.000 (0.292) Mem 2393MB
[2024-02-28 09:37:39 TinyViT-5M-1k](main.py 441): INFO Test: [140/196] Time 0.149 (0.529) Loss 8.3906 (8.3966) Acc@1 0.000 (0.100) Acc@5 0.781 (0.283) Mem 2393MB
[2024-02-28 09:37:43 TinyViT-5M-1k](main.py 441): INFO Test: [150/196] Time 0.076 (0.520) Loss 8.3203 (8.3954) Acc@1 0.000 (0.098) Acc@5 1.562 (0.316) Mem 2393MB
[2024-02-28 09:37:46 TinyViT-5M-1k](main.py 441): INFO Test: [160/196] Time 0.083 (0.512) Loss 8.2812 (8.3932) Acc@1 0.000 (0.102) Acc@5 0.000 (0.306) Mem 2393MB
[2024-02-28 09:37:53 TinyViT-5M-1k](main.py 441): INFO Test: [170/196] Time 0.087 (0.521) Loss 8.4453 (8.3957) Acc@1 0.000 (0.101) Acc@5 0.781 (0.302) Mem 2393MB
[2024-02-28 09:37:58 TinyViT-5M-1k](main.py 441): INFO Test: [180/196] Time 0.077 (0.517) Loss 8.3125 (8.3944) Acc@1 0.000 (0.095) Acc@5 0.000 (0.306) Mem 2393MB
[2024-02-28 09:38:02 TinyViT-5M-1k](main.py 441): INFO Test: [190/196] Time 0.076 (0.510) Loss 8.4688 (8.3941) Acc@1 0.000 (0.094) Acc@5 0.000 (0.295) Mem 2393MB
[2024-02-28 09:38:03 TinyViT-5M-1k](main.py 451): INFO The number of validation samples is 50000
[2024-02-28 09:38:03 TinyViT-5M-1k](main.py 453): INFO * Acc@1 0.090 Acc@5 0.296
[2024-02-28 09:38:03 TinyViT-5M-1k](main.py 128): INFO Accuracy of the network on the 50000 test images: 0.1%
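
One way to test the "random parameters" guess is to compare the numbers in the log with what pure chance would give on 1000 classes, and then to check whether the parameter names in the checkpoint match those of the model. The sketch below is pure Python and is not part of the repository: `chance_level` and `diff_keys` are hypothetical helpers, and the parameter names in the usage lines are made up for illustration. In PyTorch the real names would come from `model.state_dict().keys()` and from the loaded checkpoint dictionary (whether the weights sit under a `"model"` entry is an assumption to verify).

```python
import math


def chance_level(num_classes):
    """Return (top-1 accuracy in %, cross-entropy in nats) of a uniform guess."""
    return 100.0 / num_classes, math.log(num_classes)


def diff_keys(model_keys, checkpoint_keys):
    """Return (names the checkpoint does not provide, names the model does not use)."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


acc, loss = chance_level(1000)
print(f"chance top-1: {acc:.1f}%, chance loss: {loss:.3f}")
# prints "chance top-1: 0.1%, chance loss: 6.908"

# Hypothetical parameter names, for illustration only.
missing, unexpected = diff_keys(
    ["head.weight", "head.bias", "stem.weight"],
    ["head.weight", "head.bias", "module.stem.weight"],
)
print("missing from checkpoint:", missing)
print("unused checkpoint entries:", unexpected)
```

The reported 0.1% top-1 accuracy matches chance, but the running loss of about 8.39 is above the uniform-guess value of about 6.908. That means the predictions are confidently wrong instead of uniform, which might point to a cause other than unloaded weights, such as a mismatch between the label order and the class indices. This is only a hypothesis; empty `missing` and `unexpected` lists from a key comparison would at least rule out a naming mismatch.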