czczup / ViT-Adapter

[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
https://arxiv.org/abs/2205.08534
Apache License 2.0
1.26k stars 139 forks source link

Poor performance of Mask2Former based on coco-stuff-164k #141

Open clxia12 opened 1 year ago

clxia12 commented 1 year ago

Hi! This is excellent work. By the way, I would like to ask a question about "Mask2Former BEiTv2 + COCO-Stuff-164K".

When I reproduced the "mask2former_beitv2_adapter_large_896_80k_cocostuff164k" experiment, the metrics were abnormal. The details are as follows.

+-------+-------+-------+
| aAcc | mIoU | mAcc |
+-------+-------+-------+
| 67.52 | 37.57 | 48.97 |
+-------+-------+-------+
2023-10-13 15:14:11,850 - mmseg - INFO - The previous best checkpoint /root/paddlejob/workspace/env_run/xiachunlong/models/baidu/adu-lab/foundation_model_reasearch/ViT-Adapter/segmentation/work_dirs/mask2former_beitv2_adapter_large_896_80k_cocostuff164k_ss/best_mIoU_iter_6000.pth was removed
2023-10-13 15:14:29,770 - mmseg - INFO - Now best checkpoint is saved as best_mIoU_iter_8000.pth.
2023-10-13 15:14:29,771 - mmseg - INFO - Best mIoU is 0.3757 at 8000 iter.

yours:
+-------+------+-------+
| aAcc  | mIoU | mAcc  |
+-------+------+-------+
| 70.74 | 46.1 | 58.27 |
+-------+------+-------+

Do the authors know what the problem is?

czczup commented 1 year ago

Hello, thank you for your feedback. Can you give me some information about your environment?

clxia12 commented 1 year ago

Python: 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0]
CUDA available: True
GPU 0,1,2,3,4,5,6,7: NVIDIA A800-SXM4-80GB
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.0_bu.TC445_37.28845127_0
GCC: gcc (GCC) 8.2.0
PyTorch: 1.7.1+cu110
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.0
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;
  - CuDNN 8.0.5
  - Magma 2.5.2
  - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSEPYTORCH
TorchVision: 0.8.2+cu110
OpenCV: 3.4.15
MMCV: 1.4.2
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.0
MMSegmentation: 0.20.2+c5d6218
------------------------------------------------------------

czczup commented 1 year ago

Could you please confirm whether the pre-trained weights have been successfully loaded?
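A quick way to check this is to compare the checkpoint's keys against the model's `state_dict`. The sketch below is illustrative only: the tiny `Sequential` model and the in-memory checkpoint are stand-ins for the real ViT-Adapter model and the BEiTv2 `.pth` file, which would normally be read with `torch.load`.

```python
import torch

# Sketch: verify that pre-trained weights actually end up in the model by
# loading with strict=False and inspecting the missing/unexpected keys.
# The model and checkpoint here are dummies; substitute your real model
# and torch.load("<your_beitv2_checkpoint>.pth", map_location="cpu").
model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.Linear(4, 2))

# Real checkpoints often nest the weights under "state_dict" or "model".
ckpt = {"state_dict": model.state_dict()}
state = ckpt.get("state_dict", ckpt.get("model", ckpt))

missing, unexpected = model.load_state_dict(state, strict=False)
print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")
# A long list of missing/unexpected keys usually means the backbone
# weights were NOT loaded and training started from random init.
```

If most backbone keys show up as missing, the low mIoU is more likely a weight-loading problem than a hyperparameter problem.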

czczup commented 1 year ago

I just checked the config, and I think one possible reason is that I ran this experiment using 2 nodes, resulting in a total batch size of 16. If you are using only 1 node, you need to modify the config by changing the batch size per GPU from 1 to 2.
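In MMSegmentation-style configs the per-GPU batch size lives in `data.samples_per_gpu`, so the fix could be sketched as below (field names follow the mmseg 0.x convention; the node/GPU counts are the ones from this thread):

```python
# Sketch of the single-node fix: keep the total batch size at 16.
# 2 nodes x 8 GPUs x 1 sample/GPU = 16  ->  1 node x 8 GPUs x 2 samples/GPU = 16.
data = dict(
    samples_per_gpu=2,  # was 1 when the authors trained on 2 nodes
    workers_per_gpu=2,
)

nodes, gpus_per_node = 1, 8
total_batch_size = nodes * gpus_per_node * data["samples_per_gpu"]
print(total_batch_size)  # 16, matching the original training setup
```

A mismatched total batch size also desynchronizes the learning-rate schedule, which can easily account for several points of mIoU.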

clxia12 commented 1 year ago

Perhaps the number of nodes is crucial; let me give it a try.

bio-mlhui commented 4 months ago

I also encountered the same problem with Mask2Former + COCO-Stuff-164K + BEiTv2-L + 80k training. After the first 8000 iters, mIoU/mAcc is 37.69 | 49.31. I am trying to set samples_per_gpu to 2.