AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Subtleties of training #973

Open AshleyRoth opened 6 years ago

AshleyRoth commented 6 years ago

Hello, @AlexeyAB! I have some questions about training:

  1. How can I train the network to detect small objects? I am training on several road signs and ran into a problem: the trained model detects a road sign only when it is close. A small sign in the distance is not recognized; detection happens only once the sign is close. I tried setting the width and height to 808x808: the detection range improved, but performance dropped dramatically. Can I train it differently? Do I need to change some parameters in the configuration?
  2. Which image size is better to use? Right now my images are 1280x720, and some are smaller. Does the image size matter?

I am really hoping for your answer. Thanks!

AlexeyAB commented 6 years ago

@AshleyRoth @JoeMichael3 Hi,

  1. You should use the yolov3.cfg model and change it:

    for training for small objects - set layers = -1, 4 instead of https://github.com/AlexeyAB/darknet/blob/6390a5a2ab61a0bdf6f1a9a6b4a739c16b36e0d7/cfg/yolov3.cfg#L720 and set stride=8 instead of https://github.com/AlexeyAB/darknet/blob/6390a5a2ab61a0bdf6f1a9a6b4a739c16b36e0d7/cfg/yolov3.cfg#L717

    Also set width=832 height=832 or 608x608. Set random=1 and train.


  2. The image should be large enough that you can see your objects. General rule: keep the relative size of objects in the training and testing datasets roughly the same:

    • train_network_width * train_obj_width / train_image_width ~= detection_network_width * detection_obj_width / detection_image_width

    • train_network_height * train_obj_height / train_image_height ~= detection_network_height * detection_obj_height / detection_image_height
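The rule above can be checked numerically. Below is a minimal Python sketch, assuming all sizes are in pixels. The function names `relative_size` and `sizes_match` and the tolerance factor of 2 are illustrative choices, not part of darknet.

```python
def relative_size(network_size, obj_size, image_size):
    """Object size in pixels after the image is resized to the network input."""
    return network_size * obj_size / image_size


def sizes_match(train, detect, max_ratio=2.0):
    """Compare the relative object size at training and at detection time.

    `train` and `detect` are (network_size, obj_size, image_size) tuples.
    `max_ratio` is an arbitrary illustrative threshold, not a darknet setting.
    """
    a = relative_size(*train)
    b = relative_size(*detect)
    return max(a, b) / min(a, b) <= max_ratio


# Hypothetical example: 40 px wide objects in 1280 px wide images,
# trained with network width 416 and detected with network width 608.
print(relative_size(416, 40, 1280))                            # 13.0
print(sizes_match((416, 40, 1280), (608, 40, 1280)))           # True
```

The same functions apply to heights by passing the height values instead of the widths.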

More info: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

AshleyRoth commented 6 years ago

@AlexeyAB I saw your explanation of the 2nd item in #977. Can you explain it, please?

The small object size varies from 20x20 to 100x100. The size of large image is at 3840x2160.
...
step 1: I cropped large images into smaller ones for training purpose, the size of cropped images vary from 245x199 to 1071x1005
...
step 3: I stopped training after 2000 iteration, then I tested with a 3840x2160 image. None of the small vehicles is detected. Only two large buildings (square size ) are detected.

So the smallest object size ~20x20. Also network size 416x416, image 3840x2160, and cropped image 245x199:

you trained with train_network_width * train_obj_width / train_image_width = 416 * 20 / 245 ~= 34
but you tested with detection_network_width * detection_obj_width / detection_image_width = 416 * 20 / 3840 ~= 2

I.e. 2 is much lower than 34, so the neural network can't detect objects in this case.
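The two numbers in the quote come straight from the width formula. Here is a short Python check that uses only the values quoted above (network width 416, object width 20, cropped image width 245, full image width 3840). The variable names are illustrative.

```python
network_width = 416
obj_width = 20

train_image_width = 245      # smallest cropped training image
detect_image_width = 3840    # full-size test image

# Object width in pixels as the network sees it after resizing.
train_rel = network_width * obj_width / train_image_width
detect_rel = network_width * obj_width / detect_image_width

print(round(train_rel))                   # 34
print(round(detect_rel))                  # 2
print(round(train_rel / detect_rel, 1))   # 15.7
```

In other words, the objects seen at detection time are roughly 16 times smaller than the ones the network was trained on.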

How can I get these values? The sizes of my images range from 600x800 to 1390x830.

  1. One more thing! I set 608x608, layers = -1, 4 and stride=8 in the cfg. My GPU is a 1080 with 8GB. When I run training I get -nan. Why? I had to set subdivisions to 64, because otherwise I got a CUDA error.
Region 106 Avg IOU: 0.403059, Class: 0.380099, Obj: 0.513232, No Obj: 0.469630, .5R: 0.500000, .75R: 0.000000,  count: 2
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.535338, .5R: -nan, .75R: -nan,  count: 0
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.534018, .5R: -nan, .75R: -nan,  count: 0

And is this normal?: 29: 25215.802734, 30369.824219 avg loss, 0.000000 rate, 22.115349 seconds, 1856 images. That is after 30 min of training. Such big values....

Thanks!

AlexeyAB commented 6 years ago

how can I get these values?

Substitute your values in these formulas.

you trained with train_network_width * train_obj_width / train_image_width =
but you tested with detection_network_width * detection_obj_width / detection_image_width =

And is this normal?: 29: 25215.802734, 30369.824219 avg loss, 0.000000 rate, 22.115349 seconds, 1856 images. That is after 30 min of training. Such big values....

Did you train only 29 iterations in 30 minutes of training? Do you use a GPU? You should train for more than 2000 iterations.
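For reference, the status line quoted above can be split into its fields. This is a rough Python sketch, assuming the line follows the pattern `iteration: loss, avg loss, rate, seconds, images` as it appears in the log. The regex, the function `parse_status` and the field names are illustrative, not a darknet API.

```python
import re

LINE = ("29: 25215.802734, 30369.824219 avg loss, 0.000000 rate, "
        "22.115349 seconds, 1856 images")

PATTERN = re.compile(
    r"(?P<iteration>\d+): (?P<loss>[\d.]+), (?P<avg_loss>[\d.]+) avg loss, "
    r"(?P<rate>[\d.]+) rate, (?P<seconds>[\d.]+) seconds, (?P<images>\d+) images"
)


def parse_status(line):
    """Return the fields of one training status line as a dict of numbers."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a status line: %r" % line)
    d = m.groupdict()
    return {
        "iteration": int(d["iteration"]),
        "loss": float(d["loss"]),
        "avg_loss": float(d["avg_loss"]),
        "rate": float(d["rate"]),
        "seconds": float(d["seconds"]),
        "images": int(d["images"]),
    }


status = parse_status(LINE)
# 1856 images over 29 iterations is 64 images per iteration.
print(status["images"] // status["iteration"])  # 64
```

The leading 29 is the iteration counter, so this line was printed very early in the run. If every iteration kept taking about 22 seconds, 2000 iterations would need roughly 12 hours on this setup.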

AshleyRoth commented 6 years ago

@AlexeyAB Yes, I use my GPU. I set batch and subdivisions to 64, because with anything below 64 I get a memory error every time. I set height and width to 416x416.

1600 iterations took 6 hours..

Sorry for the stupid question, but I did not understand how to work with this and what values to put here. And where do I put the result of the calculations? :(

you trained with train_network_width * train_obj_width / train_image_width =
but you tested with detection_network_width * detection_obj_width / detection_image_width =
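One possible reading of the formulas: the two results are not pasted into the cfg, they are only compared with each other. If they differ a lot, either the images or the network size (width and height in the cfg) can be changed until they are close. The sketch below solves the width formula for the detection network width. It assumes a hypothetical 40 px wide sign, treats 600 and 1390 from the image sizes mentioned earlier as widths, and rounds to a multiple of 32 because darknet expects network sizes divisible by 32. The function name `matching_network_width` is illustrative.

```python
def matching_network_width(train_network, train_obj, train_image,
                           detect_obj, detect_image, multiple=32):
    """Solve the width formula for the detection network width.

    The result is rounded to the nearest multiple of `multiple`.
    """
    target = train_network * train_obj / train_image   # training-side value
    exact = target * detect_image / detect_obj         # solve for network width
    return max(multiple, round(exact / multiple) * multiple)


# Assumed values: trained at network width 416 on 600 px wide images,
# detecting on 1390 px wide images, with a 40 px wide sign in both.
print(matching_network_width(416, 40, 600, 40, 1390))  # 960
```

The same calculation with the height values gives the matching network height.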