WongKinYiu / yolor

implementation of paper - You Only Learn One Representation: Unified Network for Multiple Tasks (https://arxiv.org/abs/2105.04206)
GNU General Public License v3.0

Detection differences between YOLO PyTorch frameworks? #83

Closed mariusfm54 closed 2 years ago

mariusfm54 commented 2 years ago

I recently used the archived ultralytics YOLOv3 repository to convert Darknet weights to PyTorch weights and ran inference on a set of images. I then used this yolor repository with the converted YOLOv3 PyTorch weights (and the cfg file) to run inference on the same dataset: the results are noticeably better and the detections are more accurate. I am wondering why this repository gives better results: what is the difference between the two detectors? And how come I can run inference with YOLOv3 weights in a YOLOR repository at all? I assume YOLOR reads my cfg file, detects that these are YOLOv3 weights, and runs YOLOv3 inference on my images, but then why are the results better than with the YOLOv3 repo?
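For reference, a minimal sketch of the kind of Darknet-to-PyTorch conversion described here, using the Darknet and load_darknet_weights helpers exposed by the archived ultralytics/yolov3 models.py; the exact names, paths, and checkpoint layout are recalled from that code base and may differ between versions, so treat them as assumptions to verify against the version you cloned.

# Sketch of converting Darknet weights to a PyTorch checkpoint with the
# archived ultralytics/yolov3 code base (names and checkpoint keys are
# assumptions; check them against your clone's models.py).
import torch
from models import Darknet, load_darknet_weights

cfg = 'cfg/yolov3-custom.cfg'               # hypothetical cfg path
weights = 'weights/yolov3-custom.weights'   # hypothetical darknet weights

model = Darknet(cfg)                        # build the model graph from the cfg
load_darknet_weights(model, weights)        # copy the darknet weights into it

# save the state_dict under a 'model' key so a cfg-based repo can reload it
torch.save({'model': model.state_dict()}, 'weights/yolov3-custom.pt')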

WongKinYiu commented 2 years ago

Could you show example outputs from the two repositories?

mariusfm54 commented 2 years ago

Sure. This is what I get with YOLOR (the detections look accurate):

image 593/1599 ../python_scripts/image/3655.jpg: 768x1280 Done. (0.022s)
image 594/1599 ../python_scripts/image/3660.jpg: 768x1280 Done. (0.022s)
image 595/1599 ../python_scripts/image/3665.jpg: 768x1280 Done. (0.022s)
image 596/1599 ../python_scripts/image/3670.jpg: 768x1280 Done. (0.022s)
image 597/1599 ../python_scripts/image/3675.jpg: 768x1280 Done. (0.022s)
image 598/1599 ../python_scripts/image/3680.jpg: 768x1280 Done. (0.022s)
image 599/1599 ../python_scripts/image/3685.jpg: 768x1280 Done. (0.022s)
image 600/1599 ../python_scripts/image/3690.jpg: 768x1280 Done. (0.022s)
image 601/1599 ../python_scripts/image/3695.jpg: 768x1280 1 Diamonds, Done. (0.022s)
image 602/1599 ../python_scripts/image/370.jpg: 768x1280 Done. (0.022s)
image 603/1599 ../python_scripts/image/3700.jpg: 768x1280 1 Diamonds, Done. (0.022s)
image 604/1599 ../python_scripts/image/3705.jpg: 768x1280 1 Diamonds, 1 Rectangulars, 1 WarningSpeedLimits, Done. (0.022s)
image 605/1599 ../python_scripts/image/3710.jpg: 768x1280 2 Diamonds, 1 Rectangulars, 1 WarningSpeedLimits, Done. (0.022s)
image 606/1599 ../python_scripts/image/3715.jpg: 768x1280 1 Diamonds, 1 Rectangulars, 1 WarningSpeedLimits, Done. (0.022s)
image 607/1599 ../python_scripts/image/3720.jpg: 768x1280 1 Diamonds, 1 Rectangulars, 1 WarningSpeedLimits, Done. (0.022s)
image 608/1599 ../python_scripts/image/3725.jpg: 768x1280 Done. (0.022s)
image 609/1599 ../python_scripts/image/3730.jpg: 768x1280 Done. (0.022s)
image 610/1599 ../python_scripts/image/3735.jpg: 768x1280 Done. (0.022s)
image 611/1599 ../python_scripts/image/3740.jpg: 768x1280 1 Rectangulars, Done. (0.022s)
image 612/1599 ../python_scripts/image/3745.jpg: 768x1280 1 Rectangulars, Done. (0.022s)
image 613/1599 ../python_scripts/image/375.jpg: 768x1280 Done. (0.022s)
image 614/1599 ../python_scripts/image/3750.jpg: 768x1280 Done. (0.022s)
image 615/1599 ../python_scripts/image/3755.jpg: 768x1280 1 Rectangulars, Done. (0.023s)
image 616/1599 ../python_scripts/image/3760.jpg: 768x1280 2 Rectangulars, Done. (0.023s)
image 617/1599 ../python_scripts/image/3765.jpg: 768x1280 Done. (0.022s)

This is what I get with the YOLOv3 repo:

image 593/1599 ../../python_scripts/image/3655.jpg: 288x512 Done. (0.011s)
image 594/1599 ../../python_scripts/image/3660.jpg: 288x512 Done. (0.011s)
image 595/1599 ../../python_scripts/image/3665.jpg: 288x512 Done. (0.011s)
image 596/1599 ../../python_scripts/image/3670.jpg: 288x512 Done. (0.011s)
image 597/1599 ../../python_scripts/image/3675.jpg: 288x512 Done. (0.011s)
image 598/1599 ../../python_scripts/image/3680.jpg: 288x512 Done. (0.011s)
image 599/1599 ../../python_scripts/image/3685.jpg: 288x512 Done. (0.011s)
image 600/1599 ../../python_scripts/image/3690.jpg: 288x512 Done. (0.011s)
image 601/1599 ../../python_scripts/image/3695.jpg: 288x512 Done. (0.011s)
image 602/1599 ../../python_scripts/image/370.jpg: 288x512 Done. (0.011s)
image 603/1599 ../../python_scripts/image/3700.jpg: 288x512 Done. (0.011s)
image 604/1599 ../../python_scripts/image/3705.jpg: 288x512 Done. (0.011s)
image 605/1599 ../../python_scripts/image/3710.jpg: 288x512 1 Diamonds, Done. (0.011s)
image 606/1599 ../../python_scripts/image/3715.jpg: 288x512 1 Diamonds, 1 Rectangulars, Done. (0.011s)
image 607/1599 ../../python_scripts/image/3720.jpg: 288x512 1 Diamonds, Done. (0.011s)
image 608/1599 ../../python_scripts/image/3725.jpg: 288x512 Done. (0.011s)
image 609/1599 ../../python_scripts/image/3730.jpg: 288x512 Done. (0.011s)
image 610/1599 ../../python_scripts/image/3735.jpg: 288x512 Done. (0.011s)
image 611/1599 ../../python_scripts/image/3740.jpg: 288x512 Done. (0.011s)
image 612/1599 ../../python_scripts/image/3745.jpg: 288x512 1 Rectangulars, Done. (0.011s)
image 613/1599 ../../python_scripts/image/375.jpg: 288x512 Done. (0.011s)
image 614/1599 ../../python_scripts/image/3750.jpg: 288x512 Done. (0.011s)
image 615/1599 ../../python_scripts/image/3755.jpg: 288x512 Done. (0.011s)
image 616/1599 ../../python_scripts/image/3760.jpg: 288x512 Done. (0.011s)
image 617/1599 ../../python_scripts/image/3765.jpg: 288x512 Done. (0.011s)

WongKinYiu commented 2 years ago

Oh, I meant the output images with predictions drawn on them.

WongKinYiu commented 2 years ago

By the way, I think the main reason is the inference size: in yolor you use 768x1280, and in yolov3 you use 288x512.
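For context, both repositories letterbox-resize each image so its longest side matches --img-size before running the network. A simplified sketch of that resize (not the exact letterbox() helper from either datasets.py, and assuming 16:9 source frames such as 1920x1080) shows why the two runs report 288x512 and roughly 768x1280 inputs:

# Simplified letterbox resize: scale so the longest side equals the requested
# size, then pad height/width up to a multiple of 32 for the network strides.
import cv2
import numpy as np

def letterbox(img, new_size=640, color=(114, 114, 114)):
    h, w = img.shape[:2]
    r = new_size / max(h, w)                       # scale factor for the longest side
    nh, nw = int(round(h * r)), int(round(w * r))
    img = cv2.resize(img, (nw, nh), interpolation=cv2.INTER_LINEAR)
    ph, pw = (32 - nh % 32) % 32, (32 - nw % 32) % 32
    return cv2.copyMakeBorder(img, 0, ph, 0, pw, cv2.BORDER_CONSTANT, value=color)

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)  # dummy full-HD frame
print(letterbox(frame, 512).shape)    # (288, 512, 3)  -> the input size in the YOLOv3 log
print(letterbox(frame, 1280).shape)   # (736, 1280, 3) -> close to the 768x1280 in the YOLOR log
                                      #    (the exact padding depends on the repo's stride)

At 288x512 each object is covered by far fewer pixels than at 768x1280, which is consistent with the missed detections in the second log.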

mariusfm54 commented 2 years ago

You were right: the inference sizes I used were the defaults of each repository, which is why they differed. This link was useful for understanding inference size: https://github.com/ultralytics/yolov3/issues/232. It seems --img-size should be set to the largest dimension of the images I want to run on.
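A small hypothetical helper (not part of either repo) that applies that rule, rounding the longest image side up to a multiple of the model stride:

# Pick an --img-size value from the source image dimensions: take the longest
# side and round it up to a multiple of the stride so the frames are processed
# at roughly native resolution.
import math

def pick_img_size(width, height, stride=32):
    longest = max(width, height)
    return int(math.ceil(longest / stride) * stride)

print(pick_img_size(1920, 1080))   # 1920 for full-HD frames
print(pick_img_size(1280, 720))    # 1280 for 720p frames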

With identical inference sizes I get similar results, although one class is never detected with the YOLOv3 repo while it is detected with YOLOR... This is probably a bug in the YOLOv3 repo, as I think I should be able to run inference with any of these PyTorch repos (YOLOv3, YOLOv4, or YOLOR) and get the same results.
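One way to pin down which class is being missed, assuming both detect.py scripts are run with the same --img-size and --conf-thres plus their --save-txt option: compare per-class counts over the saved label files (first column of each line is the class id). The directory names below are placeholders.

# Compare per-class detection counts from two --save-txt output folders.
from collections import Counter
from pathlib import Path

def class_counts(label_dir):
    counts = Counter()
    for txt in Path(label_dir).glob('*.txt'):
        for line in txt.read_text().splitlines():
            counts[int(line.split()[0])] += 1   # class id is the first column
    return counts

print(class_counts('runs/yolor_labels'))    # hypothetical output dirs
print(class_counts('runs/yolov3_labels'))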