Detect desired class(es) using pretrained weights

AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )

http://pjreddie.com/darknet/

Detect desired class(es) using pretrained weights #201

Closed MyVanitar closed 7 years ago

MyVanitar commented 7 years ago

Hi,

Suppose we want to use the VOC pre-trained weights, which cover 20 trained classes. Is it possible to detect and show just one class or a selected subset of them, for example just person (pedestrian), or just person and sofa?

AlexeyAB commented 7 years ago

Hi,

Yes, just add one line - if(): https://github.com/AlexeyAB/darknet/issues/196#issuecomment-330042047

for(size_t i = 0; i < result_vec.size(); ++i) {
 bbox_t box =  result_vec[i];
 if(box.obj_id == 14 || box.obj_id == 17) {
     std::cout << box.obj_id << "\n"; // print each matching object id on its own line
 }
 // do something else with box ...
}

Here 14 is person and 17 is sofa (zero-based line numbers) in the VOC names file: https://github.com/AlexeyAB/darknet/blob/master/build/darknet/x64/data/voc.names
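To avoid hard-coding the numbers, the ids can be looked up by name. The following is only a sketch: `find_class_id` and `voc_names` are hypothetical helpers, not part of Darknet, and the list is assumed to match the order of the lines in voc.names.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not part of Darknet): returns the zero-based index of
// `name` in `names`, or -1 if the name is not in the list.
int find_class_id(const std::vector<std::string> &names, const std::string &name) {
    auto it = std::find(names.begin(), names.end(), name);
    return it == names.end() ? -1 : static_cast<int>(it - names.begin());
}

// The 20 VOC class names, assumed to be in the same order as voc.names,
// so that the index of a name equals its obj_id.
std::vector<std::string> voc_names() {
    return {"aeroplane", "bicycle", "bird", "boat", "bottle",
            "bus", "car", "cat", "chair", "cow",
            "diningtable", "dog", "horse", "motorbike", "person",
            "pottedplant", "sheep", "sofa", "train", "tvmonitor"};
}
```

With this list, `find_class_id(voc_names(), "person")` gives 14 and `find_class_id(voc_names(), "sofa")` gives 17, which are the values used in the `if()` above. A name that is not in the list, such as "pedestrian", gives -1.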

MyVanitar commented 7 years ago

This is not related to the topic of this issue, but I'll ask it here instead of opening a new one.

After training the model (latest commit), I see that the confidence for objects in the training images is not 99% or 100%. Why?

As far as I know, it should not be like this: all objects in the training images should have 99% to 100% confidence, shouldn't they?

AlexeyAB commented 7 years ago

No, a DNN can't memorize all the training images, so it can't predict them perfectly. Since it can't be sure, it can't assign a 100% probability that it has found the object correctly.

MyVanitar commented 7 years ago

If you look at the images here: http://www.pyimagesearch.com/2017/09/11/object-detection-with-deep-learning-and-opencv/

Apart from what the article is about, it has many instances where the detection confidence is 99% or 100%.

AlexeyAB commented 7 years ago

As far as I remember, SSD uses only the soft-max function for the probability, so it can be 1.0 for one detected object (or ~0.9-0.99 for a small number of detected objects with high probability): https://en.wikipedia.org/wiki/Softmax_function The same is true of Yolo v1.

But Yolo v2 uses soft-max() * logistic_activation(). The soft-max can be 1, but the logistic activation can't be 1, because 1. / (1. + exp(-x)) only approaches 1 asymptotically: https://en.wikipedia.org/wiki/Activation_function#Comparison_of_activation_functions

I.e. Yolo v2 uses: prob = IoU(box, object) = t0 * class-probability = logistic_activation(scale) * soft-max(prob_of_some_class)

where t0 can't be 1.
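The effect can be reproduced numerically. This is a sketch written for illustration, not Darknet's code: `logistic`, `softmax` and `detection_score` are hypothetical names, and the raw network outputs (logits) are made-up inputs.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Logistic activation: approaches 1 as x grows, but never reaches it mathematically.
double logistic(double x) {
    return 1.0 / (1.0 + std::exp(-x));
}

// Soft-max over class logits. Assumes `logits` is not empty.
// The maximum is subtracted first for numerical stability.
std::vector<double> softmax(const std::vector<double> &logits) {
    const double max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<double> out(logits.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        out[i] = std::exp(logits[i] - max_logit);
        sum += out[i];
    }
    for (double &v : out) v /= sum;
    return out;
}

// Product described above: logistic(objectness) * soft-max(class).
double detection_score(double objectness_logit,
                       const std::vector<double> &class_logits,
                       std::size_t class_id) {
    return logistic(objectness_logit) * softmax(class_logits)[class_id];
}
```

With a single class the soft-max is exactly 1, so the score is limited only by the logistic term: for an objectness logit of 4.0 it is about 0.982, which is why a confident detection is still reported below 100%.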

MyVanitar commented 7 years ago

Thank you

MyVanitar commented 7 years ago

How can I decide which object_id is shown on screen?

I mean, the change above affects the console output, but I also want the visual detection to show only, for example, person.

Should I modify this condition if (obj_names.size() > i.obj_id) inside void draw_boxes? (yolo_console_dll project)

AlexeyAB commented 6 years ago

Change this line: https://github.com/AlexeyAB/darknet/blob/548a0bc652b562723695cc107f0844f11d1a2207/src/yolo_console_dll.cpp#L168

draw_boxes(cur_frame, result_vec, obj_names, 3, current_det_fps, current_cap_fps);

To these lines, so that only aeroplanes (obj_id=0) and persons (obj_id=14) are kept and drawn:

// Erase every detection that is neither a person (14) nor an aeroplane (0); needs <algorithm>.
result_vec.erase(std::remove_if(result_vec.begin(), result_vec.end(),
    [](const bbox_t &v) { return v.obj_id != 14 && v.obj_id != 0; }), result_vec.end());
draw_boxes(cur_frame, result_vec, obj_names, 3, current_det_fps, current_cap_fps);
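The same filtering can be wrapped in a small reusable function that takes the set of wanted ids. This is only a sketch: `box_t` is a minimal stand-in for Darknet's `bbox_t` (only the `obj_id` and `prob` fields are assumed), and `keep_classes` is a hypothetical helper, not part of the library.

```cpp
#include <algorithm>
#include <unordered_set>
#include <vector>

// Minimal stand-in for bbox_t, reduced to the fields this sketch needs.
struct box_t {
    unsigned int obj_id; // class id, e.g. 14 = person in voc.names
    float prob;          // detection confidence
};

// Hypothetical helper: returns only the boxes whose obj_id is in `wanted`,
// preserving their original order (erase-remove idiom).
std::vector<box_t> keep_classes(std::vector<box_t> boxes,
                                const std::unordered_set<unsigned int> &wanted) {
    boxes.erase(std::remove_if(boxes.begin(), boxes.end(),
                               [&wanted](const box_t &b) {
                                   return wanted.count(b.obj_id) == 0;
                               }),
                boxes.end());
    return boxes;
}
```

Under these assumptions, the filtered vector would be passed to draw_boxes in place of the full result, so the set of ids can be changed in one place without editing the lambda.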

MyVanitar commented 6 years ago

The pre-trained VOC weights do not work with yolo-voc.2.0.cfg, but they do work with yolo-voc.cfg. Are there any pre-trained weights for version 2.0?

MyVanitar commented 6 years ago

Let me ask this question here, instead of opening a new issue.

What is the role of the newly added file, kmeansiou.c?

AlexeyAB commented 6 years ago

I added it just so that it doesn't get lost. Joseph used kmeansiou.c to calculate the anchors. It is there for those who find C easier to understand than Python.