ritwikdubey opened this issue 6 years ago
@ritwikdubey Hi,
There is no multi-stage prediction.
Yes, multi-stage inference can improve accuracy if you crop faces from the original, non-resized image and recognize each face in a separate inference, using a neural network that is optimized for face recognition.
Hello @ritwikdubey, I am also looking into the same problem these days. I have two separately trained YOLO models: one for license plate detection on an incoming vehicle, and a second for recognizing the characters on the detected plate. I need to build a pipeline where the output of the detection model is fed to the recognition model. If you have any further insight, kindly share it with me. Thank you.
Hello @AlexeyAB, regarding your comment 'implement multi-stage prediction by yourself inside Darknet-code', can you please point me to the file where I have to make changes for multi-stage prediction? Thank you.
@Muhammad057 To implement it for Detection on the images you should modify this function: https://github.com/AlexeyAB/darknet/blob/18d5e4f39c1441f2c21043ac3204b5cb279f8758/src/detector.c#L1083
network net_classifier = parse_network_cfg_custom("cfg/resnet152.cfg", 1);
load_weights(&net_classifier, "resnet152.weights");
And add, between these lines: https://github.com/AlexeyAB/darknet/blob/18d5e4f39c1441f2c21043ac3204b5cb279f8758/src/detector.c#L1142-L1143 the code for second-stage recognition, something like this (maybe I forgot something):
for (i = 0; i < nboxes; ++i) {
    int class_id = -1;
    float prob = 0;
    for (j = 0; j < l.classes; ++j) {
        if (dets[i].prob[j] > thresh && dets[i].prob[j] > prob) {
            prob = dets[i].prob[j];
            class_id = j;
        }
    }
    if (class_id >= 0) {
        // note: dets[i].bbox is in relative center-x/y format, so the
        // coordinates may need converting to pixel top-left for crop_image()
        image im_classify = crop_image(im, dets[i].bbox.x, dets[i].bbox.y, dets[i].bbox.w, dets[i].bbox.h);
        image r = letterbox_image(im_classify, net_classifier.w, net_classifier.h);
        float *predictions_classify = network_predict(net_classifier, r.data);
        int top = 1;
        int *indexes = calloc(top, sizeof(int)); // allocate before top_k() uses it
        top_k(predictions_classify, net_classifier.outputs, top, indexes);
        for (j = 0; j < top; ++j) { // use j here, i is the outer loop index
            int index = indexes[j];
            printf("%s: %f\n", names[index], predictions_classify[index]);
        }
        free(indexes);
        free_image(r);
        free_image(im_classify);
    }
}
I believe you meant net_classifier in load_weights function.
network net_classifier = parse_network_cfg_custom("cfg/resnet152.cfg", 1);
load_weights(&net, "resnet152.weights");
BTW, why letterbox_image the image? I can't open the code files right now, but I guess this function draws the colored box around the prediction and adds a label to it, correct? Why would I want to add that before sending the cropped image for the second prediction?
I believe you meant net_classifier in load_weights function.
Yes, I fixed it.
BTW, why letterbox_image the image? I can't open the code files right now but I guess this function draws the colored box around the prediction and adds label to it, correct? Why would I want to add that before sending the cropped image for second prediction?
No, it doesn't draw boxes. letterbox_image() resizes the image. More about letterbox_image() and resize_image(): https://github.com/AlexeyAB/darknet/issues/232#issuecomment-336955485
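For context, letterbox_image() scales the image to fit the network input while preserving the aspect ratio, and pads the remainder (darknet fills it with gray), so the crop is not stretched before classification. A sketch of just the size math involved (not the actual darknet source):

```c
/* Compute the scaled dimensions that letterbox-style resizing uses:
   fit (img_w, img_h) inside (net_w, net_h) without changing aspect ratio.
   The area outside (new_w, new_h) would then be padding. */
static void letterbox_dims(int img_w, int img_h, int net_w, int net_h,
                           int *new_w, int *new_h) {
    if ((float)net_w / img_w < (float)net_h / img_h) {
        /* width is the limiting dimension */
        *new_w = net_w;
        *new_h = img_h * net_w / img_w;
    } else {
        /* height is the limiting dimension */
        *new_h = net_h;
        *new_w = img_w * net_h / img_h;
    }
}
```

A plain resize_image() to (net_w, net_h) would instead stretch the crop, distorting characters on a wide license plate.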
@AlexeyAB Thanks for the reply. OK, I have added the above code and recompiled the darknet.sln file. When I run darknet.exe, it takes the .cfg file and weights file from the code below to detect the license plate on a vehicle.
net_classifier = parse_network_cfg_custom("cfg/yolo-obj.cfg", 1); // yolo-obj is the test cfg file
load_weights(&net_classifier, "yolo-obj_4600.weights"); // trained weights file
But what I want to do is a little different. I run the trained YOLO on a live stream (RTSP link) through this command: 'darknet.exe detector demo data/obj.data cfg/yolo-obj.cfg backup/yolo-obj_4600.weights rtsp link',
and I save the frames of the live stream in the 'result_img' folder. Now I want to run each frame through another trained model (the recognition model). Can you please help me point out the relevant changes I need to make in src/detector.c? Where can I add the recognition model's cfg file, weights file, obj.data and obj.names files? Regards.
@Muhammad057 Good to know.
One for License Plate Detection from an incoming vehicle, and secondly, the other model for recognizing the license plate from the detected model. I need to build a pipeline, where the output of detected model is fed to the recognized model. If you have any further insight, kindly share with me. Thank you.
I worked on a similar problem. I found that YOLOv3 was quite good at detecting the numbers on a license plate without a character-segmentation stage. I guess the approach you're thinking of is sufficient: 1) localize and segment the license plate region; 2) run prediction on the scaled license plate to detect the numbers.
@ritwikdubey Yes, I have already detected the numbers on the license plate (which I call my recognition model) using YOLOv3. But before detecting the license plate (LP) numbers, I first segment out the license plate from the vehicle with my detection model and then feed it to the recognition model (as stated in my earlier comments). I think YOLO doesn't have any pipeline for multi-stage prediction, i.e., first detect the LP and then recognize the letters, in the same src/detector.c code. Below is the snippet for detecting the numbers on an LP:
network net_classifier = parse_network_cfg_custom("cfg/resnet152.cfg", 1);
load_weights(&net_classifier, "resnet152.weights");
@AlexeyAB , Hi
I have a custom classifier (InceptionResNetV2) implemented in Python/Keras that takes cropped bounding boxes as image inputs. I want to plug that classifier on top of darknet detection. Could you please guide me on where I should define that model, compile it, and load the trained weights in this darknet detector code, and how to do it? Thanks for your help!
I worked on a similar problem. I found that YOLOv3 was quite good at detecting the numbers on a license plate without a character-segmentation stage. I guess the approach you're thinking of is sufficient: 1) localize and segment the license plate region; 2) run prediction on the scaled license plate to detect the numbers.
what do you mean by scaled?
@Muhammad057 Were you able to build a pipeline to combine the two stages of plate detection and recognition? I found an article that did license plate recognition with one model instead of two:
Real-Time Brazilian License Plate Detection and Recognition Using Deep Convolutional Neural Networks
@barzan-hayati Yes, I created a pipeline to combine both of my models. Anyway, thanks for sharing.
@Muhammad057 Could you please explain your method for pipeline?
Hi @AlexeyAB, is it possible to build a detection pipeline within darknet? If so, how can I do that? Say in stage 1 I want to detect all faces in an image, and then in stage 2 I want to perform face recognition of a known or unknown person.
Right now I perform face recognition on the whole image, but I wonder: if I use multi-stage detection, will it improve the accuracy?