Performance change by image size

boostcampaitech7 / level2-objectdetection-cv-18

level2-objectdetection-cv-18 created by GitHub Classroom

Closed taehan79-kim closed 4 weeks ago

taehan79-kim commented 1 month ago

1. Compare results at image sizes (1280, 1280) and (512, 512) using co_dino_5scale_lsj_swin_large_3x_coco_12e.
2. Compare results at image sizes (1024, 1024) and (1536, 1536) using co_dino_5scale_lsj_swin_tiny_12e.
3. Upscale images to (1536, 1536) with EDSR (Enhanced Deep Residual Networks for Single Image Super-Resolution) and compare the results against item 2.

taehan79-kim commented 1 month ago

Comparison of mAP50 (Test) for co_dino_5scale_lsj_swin_large_3x_coco_12e with plain resizing to image sizes (1280, 1280) and (512, 512):
- (1280, 1280) = 0.7190
- (512, 512) = 0.6686
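
To put the two numbers above side by side, the gap can be worked out as an absolute gain, a relative gain, and the ratio of pixel counts. This is only a small arithmetic sketch; the helper name `compare_sizes` is hypothetical and not part of the experiment code.

```python
def compare_sizes(results):
    """Summarize the mAP50 gap between the largest and smallest image size.

    `results` maps (width, height) -> mAP50.
    """
    # Order the sizes by pixel count so the first is smallest, the last largest.
    sizes = sorted(results, key=lambda wh: wh[0] * wh[1])
    small, large = sizes[0], sizes[-1]
    abs_gain = results[large] - results[small]
    return {
        "abs_gain": abs_gain,                      # difference in mAP50
        "rel_gain": abs_gain / results[small],     # gain relative to the small size
        "pixel_ratio": (large[0] * large[1]) / (small[0] * small[1]),
    }


# Values reported in this comment (mAP50 on Test).
summary = compare_sizes({(1280, 1280): 0.7190, (512, 512): 0.6686})
print(summary)
```

With these values the larger input gains about 0.050 mAP50 (roughly 7.5% relative) while using 6.25 times as many pixels.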

taehan79-kim commented 1 month ago

Comparison of mAP50 (Val) at image sizes (1024, 1024) and (1536, 1536) using co_dino_5scale_lsj_swin_tiny_8e:
- (1024, 1024) : 0.4160
- (1536, 1536) : 0.0720
- Note: training at (1536, 1536) was not converging, so it was stopped early at epoch 8.

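Raising the image size from (1024, 1024) to (1536, 1536) via Resize caused a sharp drop in performance rather than a gain.
Results by object size at (1536, 1536): bbox_mAP_s: 0.0000, bbox_mAP_m: 0.0030, bbox_mAP_l: 0.0570.
The suspected cause is that, given Swin-Tiny's window size (7 x 7) and the structure of the model architecture, training fails on very large images.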
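
To make the window-size hypothesis above more concrete, here is a rough sketch of how the Swin feature-map geometry changes with input size. It assumes the standard Swin-Tiny settings (patch size 4, window size 7, four stages with 2x downsampling between them), which may differ from the backbone config actually used here, and the function name `swin_stage_geometry` is hypothetical. It only illustrates the geometry and does not confirm the cause of the training failure.

```python
import math


def swin_stage_geometry(image_size, patch_size=4, window_size=7, num_stages=4):
    """Per-stage token grid and window coverage for a square input.

    Assumes standard Swin-Tiny hyperparameters; feature maps are padded
    up to a multiple of the window size, hence the ceil() on window counts.
    """
    rows = []
    for stage in range(num_stages):
        stride = patch_size * 2 ** stage           # pixels per token at this stage
        side = math.ceil(image_size / stride)      # tokens per side
        windows_per_side = math.ceil(side / window_size)
        rows.append({
            "stride": stride,
            "tokens_per_side": side,
            "windows": windows_per_side ** 2,
            "window_px": window_size * stride,     # pixels spanned by one window side
            "window_frac": window_size * stride / image_size,
        })
    return rows


for size in (1024, 1280, 1536):
    first = swin_stage_geometry(size)[0]
    print(size, first["tokens_per_side"], first["windows"],
          round(first["window_frac"], 4))
```

Under these assumptions a first-stage window always spans 28 pixels per side, so when the same image is resized from 1024 to 1536 each window covers a smaller fraction of the image width (about 2.7% versus 1.8%), while the first-stage token count grows 2.25 times (256 x 256 to 384 x 384).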

taehan79-kim commented 1 month ago

Additional experiment: comparison of mAP50 (Val) at image sizes (1024, 1024), (1280, 1280), and (1536, 1536) using co_dino_5scale_lsj_swin_tiny_8e:
- (1024, 1024) : 0.4160
- (1280, 1280) : 0.4220
- (1536, 1536) : 0.0720
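
A sweep like the one above might be set up by generating one Resize step per image size. The sketch below assumes MMDetection 2.x-style pipeline dicts, where the Resize argument is `img_scale` (in 3.x it is `scale`); the thread does not state the version, and `keep_ratio=True` and the helper name `resize_step` are also assumptions, not the config actually used.

```python
def resize_step(image_size, keep_ratio=True):
    """Build one Resize pipeline entry for a square target size.

    Assumption: MMDetection 2.x-style key `img_scale`; swap it for
    `scale` when using a 3.x-style config.
    """
    return dict(
        type="Resize",
        img_scale=(image_size, image_size),
        keep_ratio=keep_ratio,
    )


# One Resize entry per image size compared in this comment.
resize_by_size = {size: resize_step(size) for size in (1024, 1280, 1536)}
for size, step in resize_by_size.items():
    print(size, step)
```

Each entry would replace the Resize step in the train and test pipelines of an otherwise identical config, so that image size is the only variable between runs.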

taehan79-kim commented 4 weeks ago

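Additional experiment: used EDSR to upscale the train images to (2048, 2048), applied CenterCrop, and added the results to the dataset. Loaded the weights of the model already trained at (1280, 1280) and fine-tuned it, tracking mAP50 (Val) over epochs: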
- 0 epoch : "bbox_mAP": 0.635, "bbox_mAP_50": 0.724
- 1 epoch : "bbox_mAP": 0.556, "bbox_mAP_50": 0.658
- 2 epoch : "bbox_mAP": 0.567, "bbox_mAP_50": 0.667
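
When upscaled and center-cropped images are added to a detection dataset, the box annotations have to go through the same transform. The sketch below shows that coordinate mapping only (the super-resolution step itself is omitted). The source size of 1024, the scale factor of 2, and the crop size of 1024 are illustrative assumptions, since the thread states only the (2048, 2048) upscaled size, and `upscale_and_center_crop_boxes` is a hypothetical helper.

```python
def upscale_and_center_crop_boxes(boxes, src_size, scale, crop_size):
    """Map boxes through an upscale followed by a center crop.

    boxes: iterable of (x_min, y_min, x_max, y_max) in source-image pixels.
    Returns boxes in crop coordinates, clipped to the crop; boxes that end
    up with zero area (entirely outside the crop) are dropped.
    """
    up_size = src_size * scale
    offset = (up_size - crop_size) / 2  # top-left corner of the crop

    def clip(value):
        return min(max(value, 0), crop_size)

    kept = []
    for x1, y1, x2, y2 in boxes:
        nx1, ny1 = clip(x1 * scale - offset), clip(y1 * scale - offset)
        nx2, ny2 = clip(x2 * scale - offset), clip(y2 * scale - offset)
        if nx2 > nx1 and ny2 > ny1:
            kept.append((nx1, ny1, nx2, ny2))
    return kept


# Hypothetical boxes: one central, one in the corner, one partly cut off.
boxes = [(400, 400, 600, 600), (0, 0, 100, 100), (200, 500, 300, 900)]
print(upscale_and_center_crop_boxes(boxes, src_size=1024, scale=2, crop_size=1024))
```

With these illustrative numbers the corner box is dropped entirely and the third box is clipped at the crop border, so the added crops contain only objects near the image center.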