AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Can I modify tensor to appear as S*S*(B*5+C)? #3498

Open Jin-S-Kim opened 5 years ago

Jin-S-Kim commented 5 years ago

In the YOLO paper (You Only Look Once: Unified, Real-Time Object Detection), an S*S*(B*5+C) tensor is used for detection. In YOLO, the input image is divided into an S*S grid, B is the number of bounding boxes that each grid cell predicts, and C is the number of classes that can be detected. I can modify C in the .data file, but I don't know how to modify S and B. In the paper, for evaluating YOLO on PASCAL VOC, they use S=7, B=2, C=20. How can I modify S and B in darknet?
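
For reference, the size of the YOLO v1 output tensor for the PASCAL VOC settings can be checked with a few lines of Python. This is only a sketch: the function name `yolo_v1_output_shape` is made up for illustration and is not part of darknet, and it assumes the paper's S × S × (B*5 + C) layout, in which each of the B boxes carries 5 values (x, y, w, h, confidence).

```python
def yolo_v1_output_shape(S, B, C):
    """Return the (S, S, B*5 + C) prediction shape described in the YOLO v1 paper."""
    # Each box contributes 5 numbers: x, y, w, h and a confidence score.
    # The C class probabilities are shared by all boxes of one grid cell.
    return (S, S, B * 5 + C)


shape = yolo_v1_output_shape(S=7, B=2, C=20)
print(shape)  # (7, 7, 30)
```

With S=7, B=2, C=20 this gives the 7 × 7 × 30 tensor reported in the paper.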

AlexeyAB commented 5 years ago

That is about Yolo v1, which isn't supported now.

Currently in Yolo v3:

S = width_in_cfg_file / 32 (for the different [yolo] layers: /32 for the 1st, /16 for the 2nd, /8 for the 3rd)
B = masks_in_yolo_layer (i.e. the number of anchors in the yolo layer)
C = classes_in_yolo_layer
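
As a rough illustration of these relations, the sketch below computes the grid size S and the output depth of each [yolo] layer. It assumes a network input size that is a multiple of 32 and the usual three-scale layout with three masks per layer; the helper name `yolo_v3_layer_shapes` is hypothetical and not part of darknet. Unlike Yolo v1, each box in Yolo v3 carries its own class scores, so the depth is B*(5+C) rather than B*5+C.

```python
def yolo_v3_layer_shapes(width, height, classes, masks_per_layer=3, strides=(32, 16, 8)):
    """Return one (grid_h, grid_w, depth) tuple per [yolo] layer."""
    shapes = []
    for stride in strides:
        if width % stride or height % stride:
            raise ValueError(f"input size must be a multiple of {stride}")
        # S for this layer: network input size divided by the layer's stride.
        grid_w = width // stride
        grid_h = height // stride
        # Each of the B boxes has 4 coordinates, 1 objectness score and C class scores.
        depth = masks_per_layer * (5 + classes)
        shapes.append((grid_h, grid_w, depth))
    return shapes


for layer_shape in yolo_v3_layer_shapes(width=416, height=416, classes=80):
    print(layer_shape)
# (13, 13, 255)
# (26, 26, 255)
# (52, 52, 255)
```

So S follows from `width` and `height` in the [net] section, B from the number of entries in `mask`, and C from `classes` in each [yolo] layer.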

Jin-S-Kim commented 5 years ago

Thanks for replying.

Are you saying that the yolo layers are in yolov3.cfg? But why are there three yolo layers? As I understand it, YOLO divides the image into an S*S grid and each grid cell predicts B bounding boxes. Is this process done three times in total? Also, there are three numbers in the mask. Does it predict B bounding boxes chosen randomly from among these three numbers?

AlexeyAB commented 5 years ago

> YOLO divides the image into an S*S grid and each grid cell predicts B bounding boxes.

This is for the very old Yolo v1.

Read about Yolo v3: https://arxiv.org/pdf/1804.02767v1.pdf
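
On the mask question, a small sketch may help. In the cfg, `mask` is a list of indices into the shared `anchors` list, so each [yolo] layer uses exactly the anchors its mask points to rather than picking among them at random. The anchor and mask values below are assumed to match the stock yolov3.cfg; check them against your own cfg file. The helper names `parse_anchors` and `anchors_for_mask` are hypothetical and not part of darknet.

```python
# Values assumed to match the stock yolov3.cfg; verify against your own cfg.
ANCHORS = "10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326"
MASKS = {
    "1st [yolo] layer (stride 32)": "6,7,8",
    "2nd [yolo] layer (stride 16)": "3,4,5",
    "3rd [yolo] layer (stride 8)": "0,1,2",
}


def parse_anchors(text):
    """Turn the flat 'w,h, w,h, ...' cfg string into a list of (w, h) pairs."""
    values = [int(v) for v in text.split(",")]
    return list(zip(values[0::2], values[1::2]))


def anchors_for_mask(anchors, mask_text):
    """Select the anchors whose indices are listed in a layer's mask."""
    return [anchors[int(i)] for i in mask_text.split(",")]


all_anchors = parse_anchors(ANCHORS)
for layer_name, mask in MASKS.items():
    print(layer_name, anchors_for_mask(all_anchors, mask))
```

Under these assumed values, the coarsest grid (stride 32) gets the three largest anchors and the finest grid (stride 8) gets the three smallest, so B = 3 boxes per cell at each of the three scales.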