DocF / multispectral-object-detection

Multispectral Object Detection with Yolov5 and Transformer
GNU Affero General Public License v3.0

Input shape of LLVIP YOLOV5L #33

Open XiongZhongxia opened 2 years ago

XiongZhongxia commented 2 years ago

Thanks for your contribution! Could you please tell me the input shape for training LLVIP YOLOV5L, which achieves 97.5 mAP@0.5 and 5.40 MR?

XueZ-phd commented 2 years ago

I have the same question and look forward to an answer.

The author provides a pre-trained checkpoint named "yolov5l_transformerx3_llvip_s1024_bs32_e200". As this reply mentioned, the author used a 1024 x 1024 image size to train YOLOv5l on the LLVIP dataset.

However, this figure shows an image shape of 640 x 640 x 3, and another clue points to the same 640 x 640 x 3 shape.

Using the same image shape is important for a fair comparison, so I would appreciate clarification on this point.

XueZ-phd commented 2 years ago

I can answer my question now! When training YOLOV5l on LLVIP, the image shape is 1024 x 1024 x 3. Please refer to this reply.
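
The image size can also be read off the checkpoint name quoted earlier in this thread. Below is a minimal sketch, assuming the suffix follows the pattern `_s<image size>_bs<batch size>_e<epochs>`; that convention is inferred from this single file name and is not confirmed by the author. The helper `parse_checkpoint_name` and the dictionary keys are hypothetical names used only for illustration.

```python
import re

# Assumed convention (inferred from one checkpoint name, not confirmed by the repo):
#   <model>_<dataset>_s<image size>_bs<batch size>_e<epochs>[.<extension>]
CHECKPOINT_SUFFIX = re.compile(
    r"_s(?P<img_size>\d+)_bs(?P<batch_size>\d+)_e(?P<epochs>\d+)(?:\.\w+)?$"
)


def parse_checkpoint_name(name: str) -> dict:
    """Extract image size, batch size and epoch count from a checkpoint name."""
    match = CHECKPOINT_SUFFIX.search(name)
    if match is None:
        raise ValueError(f"name does not follow the assumed convention: {name!r}")
    # Convert every captured group from a digit string to an int.
    return {key: int(value) for key, value in match.groupdict().items()}


info = parse_checkpoint_name("yolov5l_transformerx3_llvip_s1024_bs32_e200")
print(info)  # {'img_size': 1024, 'batch_size': 32, 'epochs': 200}
```

Under that assumption, `s1024` matches the 1024 x 1024 training size stated above, so a comparison against this checkpoint should use 1024 rather than 640.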

XiongZhongxia commented 1 year ago

Thanks a lot!