AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

YoloV3 Tiny PRN Training #4858

Open · PaserSRL opened this issue 4 years ago

PaserSRL commented 4 years ago

Hi all, I'm going crazy trying to train YoloV3 Tiny PRN. I've read everything I could find about it and I'm trying to merge all that information into a script that prepares an environment to train the model on the required objects.

First of all, YoloV3 Tiny PRN is absolutely good, but I would like to remove useless objects (tie, pen, bottle...), so I created a script that executes the following steps:

./script --environment-name name --allowed-objects car,person,dog...

  1. Create a directory tree with a precise structure:
     ./environment-name/cfg/
     ./environment-name/dataset/images/train2014/
     ./environment-name/dataset/images/val2014/
     ./environment-name/dataset/labels/train2014/
     ./environment-name/dataset/labels/val2014/
     ./environment-name/output/

  2. Copy the whole COCO 2014 dataset from its source into my environment directory:
     coco_source (train images) -> ./environment-name/dataset/images/train2014/
     coco_source (val images)   -> ./environment-name/dataset/images/val2014/

  3. Label index correction. My goal is to remove useless objects from detection, so it is necessary to generate a new objects.names file in which each object gets a new index, remapped from the original COCO index according to my script parameter. So if the original is:
     0 - person
     1 - bicycle
     2 - car
     3 - motorbike
     and my allowed objects are:
     0 - person
     1 - car
     then I need to rewrite the COCO labels with the new index of each object (car must go from index 2 to index 1) and remove all coordinate lines that belong to objects that are not required (a sketch of this remapping is shown after this list). My script copies the labels from the source, edits them, and places them in:
     ./environment-name/dataset/labels/train2014/
     ./environment-name/dataset/labels/val2014/

  4. Create an objects.names file at ./environment-name/objects.names

  5. Create a model.data file at ./environment-name/model.data with this content:
     classes=$(number_of_classes)
     train = ./environment-name/list.txt
     valid = ./environment-name/val.txt
     names = ./environment-name/objects.names
     backup = ./environment-name/output

  6. Create a file list.txt listing all images available in ./environment-name/dataset/images/train2014/ (see the sketch after this list).

  7. Create a file val.txt listing all images available in ./environment-name/dataset/images/val2014/

  8. Copy the source yolov3-tiny-prn.cfg to ./environment-name/cfg/model.cfg and edit it following these rules:

  9. Create a train.sh script that starts training with the following command:
     ./darknet detector train $model_data_path $model_cfg_path yolov3-tiny-prn.conv.15
     According to this ticket: https://github.com/AlexeyAB/darknet/issues/4091#issuecomment-542513900 it is necessary to use pre-trained weights, so I generated them from yolov3-tiny-prn.weights with the command:
     ./darknet partial cfg/yolov3-tiny-prn.cfg yolov3-tiny-prn.weights yolov3-tiny-prn.conv.15 15
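A minimal sketch of the remapping in step 3, assuming the standard darknet label format (one `class_id x_center y_center width height` line per .txt file); the directory names follow step 1 and the REMAP table matches the person/car example above:

```python
import os

# Old COCO class index -> new index; any class not listed here is dropped.
# Matches the example above: person (0) stays 0, car (2) becomes 1.
REMAP = {0: 0, 2: 1}

def remap_label_file(src_path, dst_path):
    """Rewrite one darknet label file, remapping class ids and
    dropping lines whose class is not in REMAP."""
    kept = []
    with open(src_path) as src:
        for line in src:
            parts = line.split()
            if not parts:
                continue
            old_id = int(parts[0])
            if old_id in REMAP:
                kept.append(" ".join([str(REMAP[old_id])] + parts[1:]))
    with open(dst_path, "w") as dst:
        dst.write("\n".join(kept) + ("\n" if kept else ""))

def remap_dir(src_dir, dst_dir):
    os.makedirs(dst_dir, exist_ok=True)
    for name in os.listdir(src_dir):
        if name.endswith(".txt"):
            remap_label_file(os.path.join(src_dir, name),
                             os.path.join(dst_dir, name))

remap_dir("coco_source/labels/train2014", "environment-name/dataset/labels/train2014")
remap_dir("coco_source/labels/val2014", "environment-name/dataset/labels/val2014")
```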
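And a sketch for steps 6 and 7: darknet expects one image path per line in the files referenced by train= and valid= in model.data (the .jpg extension is an assumption about the local COCO copy):

```python
import glob, os

def write_image_list(image_dir, list_path):
    # darknet reads one absolute image path per line from the list file
    paths = sorted(glob.glob(os.path.join(os.path.abspath(image_dir), "*.jpg")))
    with open(list_path, "w") as f:
        f.write("\n".join(paths) + "\n")

write_image_list("environment-name/dataset/images/train2014", "environment-name/list.txt")
write_image_list("environment-name/dataset/images/val2014", "environment-name/val.txt")
```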

Why is every model that I try to train very, very far from the results of the original yolov3-tiny-prn (lower mAP)? What am I doing wrong?

This script is for my own purposes, but once it is finished and fully working I would like to share it with the community.

WongKinYiu commented 4 years ago

If you only want to discard some of the classes without adding new classes, it is better to modify the load_weights function. https://github.com/AlexeyAB/darknet/blob/master/src/parser.c#L1959

PaserSRL commented 4 years ago

Chart: [training chart attached]

Screenshot: [screenshot from 2020-02-12 17:26 attached]

CFG files (extension changed due to GitHub restrictions): model.cfg.txt, model.data.txt, model.names.txt

I trained the model with only 2 classes (person, car), using the entire COCO 2014 dataset.

PS: For this test I used 6000 iterations per class, so max_batches = 12000.

AlexeyAB commented 4 years ago

Why is every model that I try to train very, very far from the results of the original yolov3-tiny-prn (lower mAP)? What am I doing wrong?

The default yolov3-tiny-prn model has mAP@50 of 48% for person and 33% for car, so the average is ~40%; you got just ~3% less (37% mAP@50).

PaserSRL commented 4 years ago

The default yolov3-tiny-prn model has mAP@50 of 48% for person and 33% for car, so the average is ~40%; you got just ~3% less (37% mAP@50).

  • Try to train for 50 000 - 100 000 iterations and change max_batches= and steps= accordingly, since you use ~100 000 images of MS COCO.

I will do.
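For concreteness, following the repo README's convention of setting steps= to 80% and 90% of max_batches, training for 100 000 iterations would mean a cfg change like:

```
max_batches=100000
steps=80000,90000
```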

  • Or just use the default cfg/weights-file and add dont_show before each class in the coco.names file except car and person (see the example below).

Really? This is new for me!
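For reference, the first lines of coco.names edited that way might look like this (every class except person and car gets the dont_show prefix; the remaining classes are truncated here):

```
person
dont_show bicycle
car
dont_show motorbike
dont_show aeroplane
...
```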

My attempt to train the model myself comes from wanting to understand how to train it to detect fewer classes and smaller objects. So the first step is to understand how to train it correctly =)

I need to detect a person of 15x40 px in a frame of 416x416 px: [image: detection_416x416]

For the moment, to work around this, I run 2 detections for each frame, like this: [image: detection_320x320]

But this solution reduces fps by 50%, so if there were a way to detect small objects better (while also removing unnecessary classes), that would be nice. I already found yolov3-tiny-3l, but PRN has better detection and also better performance.

Any suggestions? (Thanks a lot for your help!!!)

AlexeyAB commented 4 years ago

But this solution reduces fps by 50%, so if there were a way to detect small objects better (while also removing unnecessary classes), that would be nice.

Instead of this: train with width=416 height=416 in the cfg-file, then after training set width=576 height=320 in the cfg and do just 1 detection instead of 2 detections.
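As a sketch, that inference-time change is just the network size in the [net] section of the cfg (both values must remain multiples of 32, which 576 and 320 are):

```
[net]
# trained at width=416 height=416; widened for 16:9 frames at inference
width=576
height=320
```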

PaserSRL commented 4 years ago

But this solution reduces fps by 50%, so if there were a way to detect small objects better (while also removing unnecessary classes), that would be nice.

Instead of this: train with width=416 height=416 in the cfg-file, then after training set width=576 height=320 in the cfg and do just 1 detection instead of 2 detections.

That's because I haven't understood what the right image aspect ratio to feed to Yolo is.

If I have a 16:9 frame (570x320, for example):

AlexeyAB commented 4 years ago

Every image will be resized to the network size automatically.
You just need to know that the distortion of objects during Training and Detection should be approximately the same.

Follow the general rule to calculate the network size and image sizes: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

General rule - your training dataset should include such a set of relative sizes of objects that you want to detect:

train_network_width * train_obj_width / train_image_width ~= detection_network_width * detection_obj_width / detection_image_width
train_network_height * train_obj_height / train_image_height ~= detection_network_height * detection_obj_height / detection_image_height

I.e. for each object from the Test dataset there must be at least 1 object in the Training dataset with the same class_id and about the same relative size:

object width in percent from Training dataset ~= object width in percent from Test dataset

That is, if only objects that occupied 80-90% of the image were present in the training set, then the trained network will not be able to detect objects that occupy 1-10% of the image.
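As a worked example with the numbers from this thread (the 15 px-wide person, and assuming a 570 px-wide source frame for illustration): detecting at network width 576 gives 576 * 15 / 570 ≈ 15 px of effective object width, while detecting at width 416 would give only 416 * 15 / 570 ≈ 11 px. So the wider inference size makes the same small object noticeably larger for the network, which is the point of the width=576 height=320 suggestion above.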

PaserSRL commented 4 years ago

If I want to train the model at a different resolution, can I use yolov3-tiny-prn.conv.15 (pre-trained at 416x416)? Or do I need to start from scratch?

AlexeyAB commented 4 years ago

If I want to train the model at a different resolution, can I use yolov3-tiny-prn.conv.15 (pre-trained at 416x416)?

Yes.

PaserSRL commented 4 years ago

If I want to train the model at a different resolution, can I use yolov3-tiny-prn.conv.15 (pre-trained at 416x416)?

Yes.

Thanks Alexey! You are a boss :)

fused-byte commented 4 years ago

To keep the distortion at a similar level in training and testing, can we train the network with one set of width & height values and use different width & height values during inference?