YOLO4 Struggle to improve detection / Advice

depauwjimmy commented 3 years ago

I have been trying to detect some outdoor electrical equipment and have a problem with one item specifically. Here's an example:

Class 0

There are many different models, but this is the most extreme one I have: very long, thin, and without many features except the different shape in the middle.

My dataset is 100% generated and gives good results for other, more square-shaped objects, but not for this one.

What I do is:

1) Create overlays with a transparent background for all the objects I need, with enough samples of each. I also rotate them manually.
2) Put those overlays over random backgrounds with a random position and object scale. The bounding box is also adjusted to fit.
3) Use the Python library imgaug to augment my images (colors, blur, noise, ...), but never rotation, since it messes up the bounding box. I have taken care that the generation does not produce bad bounding boxes after image augmentation.
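To illustrate step 2, here is a minimal sketch of how a pasted overlay's pixel box could be turned into a Darknet label line (class id, then box center and size normalized by the image dimensions). The function name `yolo_label` and the example numbers are hypothetical and are not taken from the generator described above.

```python
def yolo_label(class_id, left, top, box_w, box_h, img_w, img_h):
    """Build one Darknet label line: 'class x_center y_center width height'.

    left/top/box_w/box_h are in pixels; the output values are normalized to [0, 1].
    """
    # Clip the box to the image so an overlay that is partly off-frame stays valid.
    x0 = max(0.0, left)
    y0 = max(0.0, top)
    x1 = min(float(img_w), left + box_w)
    y1 = min(float(img_h), top + box_h)
    if x1 <= x0 or y1 <= y0:
        raise ValueError("box lies entirely outside the image")

    x_center = (x0 + x1) / 2 / img_w
    y_center = (y0 + y1) / 2 / img_h
    width = (x1 - x0) / img_w
    height = (y1 - y0) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: a 40x400 px overlay scaled by 0.5,
# pasted with its top-left corner at (100, 50) on a 640x640 background.
scale = 0.5
print(yolo_label(0, 100, 50, 40 * scale, 400 * scale, 640, 640))
# 0 0.171875 0.234375 0.031250 0.312500
```

Note how small the normalized width (about 0.03) is for a long, thin object like this one at 640x640.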

The validation set is composed of real pictures, though, to stay as close to reality as possible.

The YOLO configuration is untouched except for the changes required because I only have 3 classes.

[net]
batch=64
subdivisions=64
width=640
height=640
channels=3
momentum=0.949
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 15000
policy=steps
steps=12000,13500
scales=.1,.1

#cutmix=1
mosaic=1

With my 8 GB of VRAM I was never able to use anything other than 64 for subdivisions.

This is my best result so far (this object is class 0)

 detections_count = 1541, unique_truth_count = 1025  
class_id = 0, name = manchon, ap = 47.84%        (TP = 68, FP = 35) 
class_id = 1, name = isolateur, ap = 68.82%      (TP = 359, FP = 38) 
class_id = 2, name = blocbifilaire, ap = 89.08%      (TP = 266, FP = 10) 

 for conf_thresh = 0.25, precision = 0.89, recall = 0.68, F1-score = 0.77 
 for conf_thresh = 0.25, TP = 693, FP = 83, FN = 332, average IoU = 70.54 %
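As a sanity check on the summary lines above, the precision, recall and F1-score can be recomputed from the reported TP, FP and FN counts. This is only a sketch using the standard definitions; the helper name `detection_metrics` is made up for the example.

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from raw detection counts."""
    precision = tp / (tp + fp)  # share of detections that are correct
    recall = tp / (tp + fn)     # share of ground-truth objects that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from the output above: TP = 693, FP = 83, FN = 332
precision, recall, f1 = detection_metrics(693, 83, 332)
print(f"precision = {precision:.2f}, recall = {recall:.2f}, F1-score = {f1:.2f}")
# precision = 0.89, recall = 0.68, F1-score = 0.77
```

For class 0 alone, the same precision formula gives 68 / (68 + 35) ≈ 0.66. Per-class recall cannot be derived from this output because per-class FN counts are not listed.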

Chart

Is there anything I could change to improve the detection of such an object?

Class 1 is not great either, but the issue is different. There is not as much variation between models, but the object is basically one section that can be repeated a varying number of times.

Class 1

In this one there are 6 sections, but it can have from 4 to 20+. The shape of the section can vary a bit, as well as the color (from deep green to almost transparent). My dataset contains overlays with varying numbers of sections, but the long ones have a bad detection rate.

Any advice would be greatly appreciated.

OkuChou commented 3 years ago

How much data did you use for each class? Why did you set max_batches = 15000? 8 GB of GPU VRAM looks small; doesn't training take a very long time? Also, the width and height are both 640 px. How can 8 GB of VRAM handle such a large image size for training? I'm curious about your cfg settings.

depauwjimmy commented 3 years ago

The dataset is 100% generated, so I can have as many images as I want, but I have around 10k. I just set 15k batches so I am sure it keeps going during the night, but I usually stop between 9k and 11k because I see no improvement. For training I have either a single RTX 2070 or a dual-GPU M60, but either way it's only 8 GB per GPU. Yeah, it takes a while, around 30 hours.

640x640 is the maximum I can run with subdivisions=64, and that's the only change (apart from the class-related changes) from the base yolov4-custom.cfg.

I don't know what the result would be with a lower size, though. I also don't know what would be best: having randomly augmented images in my dataset from the start, or none at all and counting on the built-in augmentation.

OkuChou commented 3 years ago

30 hours is too long. I think you'd better reduce the size of your dataset or the number of parameters to make your model more compact. Then you can do more training runs, compare the results, and decide whether to increase your dataset size, image size or iterations.

depauwjimmy commented 3 years ago

If this is what it takes to have the best accuracy, then so be it. I doubt making my model more compact would help with that. I am not even sure what you mean by that: using yolo4-tiny, or another variant like mish or csp? I have no idea what mish/csp are, since there are not a lot of details about them, but I can give them a try if it helps. I can lower the number of pictures in my dataset, but again I don't see how that would help to get better accuracy.

Goru1890 commented 3 years ago

I think mish and csp require a lot of VRAM. Try using yolo4-tiny with the new pre-trained weights.

bulatnv commented 3 years ago

Hello @depauwjimmy.

1) Finish the training process. At the end (steps=12000,13500, i.e. after iterations 12000 and 13500), darknet lowers the learning rate by a factor of 10. As a result you should see a significant increase in accuracy. In my cases mAP generally increases by 10%.
2) Try to increase the minibatch. From my experience, minibatch = 1 is not sufficient to train models properly. To get there you can train yolov4-tiny, decrease the training resolution, or use better hardware.
3) Try to balance the classes using oversampling (geometric augmentations such as rotations and scaling help). Undersampling generally gives worse results.
4) Add negative samples to the training set.
5) For experts (very tricky to do properly): recalculate the anchors.
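Point 1 can be made concrete with a small sketch of the step policy, using the learning_rate, steps and scales values from the [net] section posted earlier in the thread. This is a simplified model, not darknet's code: it ignores burn_in, and the function name `learning_rate_at` is made up for the example.

```python
def learning_rate_at(iteration, base_lr=0.001, steps=(12000, 13500), scales=(0.1, 0.1)):
    """Learning rate under policy=steps: multiply by each scale once its step is reached."""
    lr = base_lr
    for step, scale in zip(steps, scales):
        if iteration >= step:
            lr *= scale
    return lr


for it in (9000, 11000, 12000, 13500):
    print(it, f"{learning_rate_at(it):.6f}")
# 9000 0.001000
# 11000 0.001000
# 12000 0.000100
# 13500 0.000010
```

Under these assumptions, a run stopped between 9k and 11k iterations never reaches the first learning-rate drop at 12000.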

All the best )

Goru1890 commented 3 years ago

Finish the training process. At the end (steps=12000,13500, i.e. after iterations 12000 and 13500), darknet lowers the learning rate by a factor of 10. As a result you should see a significant increase in accuracy. In my cases mAP generally increases by 10%.

How many max_batches do you use? And which learning_rate?

bulatnv commented 3 years ago

How many max_batches do you use?

Most of the time I have used 150_000 (for small datasets: 2 classes, 25000 images), 300_000, or 600_000 (for 50 classes and up to 200_000 training samples). It depends on the size of the training dataset and the number of classes. Alexey suggests training for at least 2000*classes iterations.

Usually sufficient 2000 iterations for each class(object), but not less than number of training images and not less than 6000 iterations in total https://github.com/AlexeyAB/darknet#when-should-i-stop-training
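The quoted rule of thumb can be written as a small helper. This is a sketch only: the function name `suggested_schedule` is hypothetical, and placing the steps at 80% and 90% of max_batches follows the darknet README's training instructions rather than anything stated in this thread.

```python
def suggested_schedule(num_classes, num_train_images):
    """Return (max_batches, steps) following the rule of thumb quoted above."""
    # 2000 iterations per class, but not less than the number of
    # training images and not less than 6000 iterations in total.
    max_batches = max(2000 * num_classes, num_train_images, 6000)
    # Integer arithmetic keeps the step values exact.
    steps = (max_batches * 8 // 10, max_batches * 9 // 10)
    return max_batches, steps


print(suggested_schedule(3, 10000))   # (10000, (8000, 9000))
print(suggested_schedule(1, 4200))    # (6000, (4800, 5400))
print(suggested_schedule(50, 200000)) # (200000, (160000, 180000))
```

The first call uses the numbers from the original post (3 classes, around 10k images). These are minimum values under the rule; longer schedules, like the ones mentioned above, go beyond it.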

And which learning_rate?

I do not change learning rate.

versavel commented 3 years ago

Jimmy, have you made any progress? I'm curious to learn from your findings.

depauwjimmy commented 3 years ago

I'm back from holiday, so I started on this again. To make things easier I am focusing on the one class with the worst results. Having only one class will make testing quicker. Once I have a better result I can do all of them at once again. So now I have a 4200-picture dataset and I'll run 6K iterations.

@bulatnv What is minibatch? I don't see it in the config file or anywhere else.

depauwjimmy commented 3 years ago

And upon reading the answers, I have a feeling that there is no way to properly train yolo4 with 8 GB of VRAM? I don't want to use tiny; I can run yolo4 on my production platform, so I want to have the best accuracy. But if it is impossible to get proper results with 8 GB, then I wish it was mentioned somewhere.

stephanecharette commented 3 years ago

And upon reading the answers, I have a feeling that there is no way to properly train yolo4 with 8 GB of VRAM?

I have a GPU with only 8 GiB of vram, and darknet/yolo works fine for me. You probably should start with tiny. Almost all my client projects are done using tiny. Those that aren't typically use tiny-3l. The only times I've needed the full yolo was when I'm playing around with things, or to compare yolo and yolo-tiny. But regardless, 8 GiB works fine: https://www.ccoderun.ca/programming/darknet_faq/#memory_consuption

versavel commented 3 years ago

minibatch size = batch/subdivisions. So if you set batch=64 and subdivisions=4, then the minibatch size will be 16.

I agree with @stephanecharette: start with a tiny model. It is faster to train and validate, and most likely accurate enough for your application. When the model trains faster, you can spend more time on training with different parameters, and thus optimize more quickly. Training a "full" model could be part of a fine-tuning effort at a later stage in the application development, if at that time you deem it necessary to increase accuracy.
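The same arithmetic as a short sketch, applied to the settings that appear in this thread. The helper name `minibatch_size` is made up for the example, and the divisibility check is an assumption of the sketch, not a documented darknet requirement.

```python
def minibatch_size(batch, subdivisions):
    """Number of images processed on the GPU at once: batch / subdivisions."""
    if batch % subdivisions != 0:
        raise ValueError("this sketch assumes batch is a multiple of subdivisions")
    return batch // subdivisions


print(minibatch_size(64, 64))  # 1  (the configuration from the original post)
print(minibatch_size(64, 32))  # 2
print(minibatch_size(64, 4))   # 16 (the example above)
```

With batch=64 and subdivisions=64, the minibatch is a single image, which is the minibatch = 1 case described earlier as insufficient.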

akashAD98 commented 3 years ago

The loss of my custom YOLOv4-mish model is not decreasing; it stays in the range of 25 to 35. How can I minimize the loss? I'm training 29 classes with a 50k-image training dataset.

LiaoSteve commented 3 years ago

Do you use random=1 in [yolo]? You could try:

[net]
batch=64
subdivisions=32 
width=416
height=416

[yolo]
random=1

AlexeyAB / darknet

YOLO4 Struggle to improve detection / Advice #7122