AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

How to build a yolo pipeline? #1492

Open ritwikdubey opened 6 years ago

ritwikdubey commented 6 years ago

Hi @AlexeyAB, is it possible to build a detection pipeline within darknet? If so, how can I do that? Say, in stage 1 I want to detect all faces in an image, and then in stage 2 I want to perform face recognition to tell whether each one is a known or unknown person.

Right now I perform face recognition on the whole image, but I wonder whether multi-stage detection would improve the accuracy.

AlexeyAB commented 6 years ago

@ritwikdubey Hi,

There is no multi-stage prediction in Darknet.

Yes, multi-stage inference can improve accuracy if you crop the faces from the original, non-resized image and recognize each face in a separate inference pass, using a neural network that is optimal for face recognition.
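
To make the "crop from the original, non-resized image" step concrete, here is a minimal sketch of the coordinate conversion it needs. It assumes the detector reports boxes in relative coordinates (0..1) with x, y at the box center, which is how Darknet's drawing code treats them; the `crop_rect` type and the `relative_box_to_crop` helper are hypothetical names for illustration, not part of Darknet.

```c
/* Pixel-space crop rectangle: top-left corner plus size. */
typedef struct {
    int left, top, width, height;
} crop_rect;

/* Clamp v into [lo, hi]. */
static float clampf(float v, float lo, float hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Convert a relative, center-based box (x, y, w, h in 0..1) into a pixel
 * rectangle on the ORIGINAL img_w x img_h image, clamped to its borders. */
crop_rect relative_box_to_crop(float x, float y, float w, float h,
                               int img_w, int img_h)
{
    float x0 = clampf((x - w / 2.0f) * img_w, 0.0f, (float)img_w);
    float y0 = clampf((y - h / 2.0f) * img_h, 0.0f, (float)img_h);
    float x1 = clampf((x + w / 2.0f) * img_w, 0.0f, (float)img_w);
    float y1 = clampf((y + h / 2.0f) * img_h, 0.0f, (float)img_h);

    crop_rect r;
    r.left   = (int)x0;            /* values are non-negative, so the cast truncates down */
    r.top    = (int)y0;
    r.width  = (int)x1 - r.left;
    r.height = (int)y1 - r.top;
    return r;
}
```

The resulting left, top, width and height are the kind of pixel values a crop function such as crop_image() expects; cropping at the original resolution keeps more detail for the second-stage network than cropping from the detector's resized input.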

Muhammad057 commented 6 years ago

Hello @ritwikdubey, I am also looking into the same problem these days. I have two separate trained YOLO models. One for License Plate Detection from an incoming vehicle, and secondly, the other model for recognizing the license plate from the detected model. I need to build a pipeline, where the output of detected model is fed to the recognized model. If you have any further insight, kindly share with me. Thank you.

Muhammad057 commented 6 years ago

Hello @AlexeyAB, regarding your comment 'implement multi-stage prediction by yourself inside Darknet-code', can you please point me to the file where I have to make changes for multi-stage prediction? Thank you.

AlexeyAB commented 6 years ago

@Muhammad057 To implement it for Detection on the images you should modify this function: https://github.com/AlexeyAB/darknet/blob/18d5e4f39c1441f2c21043ac3204b5cb279f8758/src/detector.c#L1083


  1. For example, add this code at https://github.com/AlexeyAB/darknet/blob/18d5e4f39c1441f2c21043ac3204b5cb279f8758/src/detector.c#L1090 (resnet152.cfg/weights stands for your classification model: face recognition, ...):
    network net_classifier = parse_network_cfg_custom("cfg/resnet152.cfg", 1);
    load_weights(&net_classifier , "resnet152.weights");

  2. Then add the code for second-stage recognition between these lines: https://github.com/AlexeyAB/darknet/blob/18d5e4f39c1441f2c21043ac3204b5cb279f8758/src/detector.c#L1142-L1143 - something like this (maybe I forgot something):

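            for (i = 0; i < nboxes; ++i) {
                // Pick the most probable class above the threshold for this box.
                int class_id = -1;
                float prob = 0;
                for (j = 0; j < l.classes; ++j) {
                    if (dets[i].prob[j] > thresh && dets[i].prob[j] > prob) {
                        prob = dets[i].prob[j];
                        class_id = j;
                    }
                }
                if (class_id >= 0) {
                    // bbox is relative (0..1) with x,y at the box center, while
                    // crop_image() takes the top-left corner and size in pixels.
                    box b = dets[i].bbox;
                    int crop_x = (b.x - b.w / 2.) * im.w;
                    int crop_y = (b.y - b.h / 2.) * im.h;
                    image im_classify = crop_image(im, crop_x, crop_y, b.w * im.w, b.h * im.h);
                    image r = letterbox_image(im_classify, net_classifier.w, net_classifier.h);
                    float *predictions_classify = network_predict(net_classifier, r.data);

                    int k;
                    int top = 1;
                    int *indexes = calloc(top, sizeof(int));   // allocate before top_k() fills it
                    top_k(predictions_classify, net_classifier.outputs, top, indexes);

                    // Use k here: reusing i would clobber the outer loop counter.
                    // NOTE: names must hold the classifier's labels, not the detector's class names.
                    for (k = 0; k < top; ++k) {
                        int index = indexes[k];
                        printf("%s: %f\n", names[index], predictions_classify[index]);
                    }
                    free(indexes);
                    free_image(r);
                    free_image(im_classify);
                }
            }

ritwikdubey commented 6 years ago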

I believe you meant net_classifier in the load_weights function:

    network net_classifier = parse_network_cfg_custom("cfg/resnet152.cfg", 1);
    load_weights(&net, "resnet152.weights");

BTW, why letterbox_image the image? I can't open the code files right now but I guess this function draws the colored box around the prediction and adds label to it, correct? Why would I want to add that before sending the cropped image for second prediction?

AlexeyAB commented 6 years ago

> I believe you meant net_classifier in load_weights function.

Yes, I fixed it.

> BTW, why letterbox_image the image? I can't open the code files right now but I guess this function draws the colored box around the prediction and adds label to it, correct? Why would I want to add that before sending the cropped image for second prediction?

No, it doesn't draw boxes. letterbox_image() resizes the image. More about letterbox_image() and resize_image(): https://github.com/AlexeyAB/darknet/issues/232#issuecomment-336955485
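
As a rough illustration of what a letterbox resize does compared with a plain resize, the sketch below computes only the geometry: the image is scaled to fit inside the network input while keeping its aspect ratio, and the remainder is padded. This follows my understanding of how Darknet's letterbox_image() picks the scaled size and centers it; the `letterbox_geom` type and the `letterbox_geometry` function are hypothetical names, and no pixels are touched here.

```c
/* Size of the scaled image and the padding that centers it in the target. */
typedef struct {
    int new_w, new_h;   /* scaled image size, aspect ratio preserved */
    int pad_x, pad_y;   /* left and top padding inside the dst_w x dst_h canvas */
} letterbox_geom;

/* Fit a src_w x src_h image into dst_w x dst_h without distorting it. */
letterbox_geom letterbox_geometry(int src_w, int src_h, int dst_w, int dst_h)
{
    letterbox_geom g;

    /* Compare dst_w/src_w with dst_h/src_h by cross-multiplying,
     * which avoids floating-point division. */
    if ((long long)dst_w * src_h < (long long)dst_h * src_w) {
        /* Width is the limiting side. */
        g.new_w = dst_w;
        g.new_h = (src_h * dst_w) / src_w;
    } else {
        /* Height is the limiting side. */
        g.new_h = dst_h;
        g.new_w = (src_w * dst_h) / src_h;
    }

    g.pad_x = (dst_w - g.new_w) / 2;
    g.pad_y = (dst_h - g.new_h) / 2;
    return g;
}
```

For a wide crop such as a license plate, a plain resize to a square input would stretch it vertically, whereas the letterbox variant keeps the proportions and fills the unused rows with padding.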

Muhammad057 commented 6 years ago

@AlexeyAB Thanks for the reply. OK, I have added the above code and rebuilt darknet.sln. When I run darknet.exe, it takes the .cfg and weights files from the code below (based on yours) to detect the license plate on a vehicle.

    net_classifier = parse_network_cfg_custom("cfg/yolo-obj.cfg", 1);  // yolo-obj.cfg is the test cfg file
    load_weights(&net_classifier, "yolo-obj_4600.weights");            // trained weights file

But, what I wanted to do is a little bit different. I run the trained YOLO on a live stream (rtsp link) through this command 'darknet.exe detector demo data/obj.data cfg/yolo-obj.cfg backup/yolo-obj_4600.weights rtsp link',

and I save the frames of the live stream in the 'result_img' folder. Now I want to run each frame through another trained model (the recognition model). Can you please point out the relevant changes that I need to make in the src/detector.c file? Where can I add the recognition model's cfg file, weights file, obj.data and obj.names files? Regards.

ritwikdubey commented 6 years ago

@Muhammad057 Good to know.

> One for License Plate Detection from an incoming vehicle, and secondly, the other model for recognizing the license plate from the detected model. I need to build a pipeline, where the output of detected model is fed to the recognized model. If you have any further insight, kindly share with me. Thank you.

I sorta worked on a similar problem. I found that YOLOv3 was quite good at detecting the numbers on a license plate without a character segmentation stage. I guess the approach you're thinking of is sufficient: 1- localize and segment the license plate region; 2- run prediction on the scaled license plate to detect the numbers.

[image: predictions]
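
Once the second stage has detected the individual characters on the cropped plate, they still have to be put in reading order. The sketch below shows one simple way to do that, assuming a single-row plate where sorting the detections by their horizontal center is enough; the `char_det` type and the `read_plate` function are hypothetical names for illustration and are not part of Darknet.

```c
#include <stdlib.h>

/* One detected character: horizontal center of its box and its label. */
typedef struct {
    float x;      /* box center, any consistent unit (relative or pixels) */
    char label;   /* recognized character */
} char_det;

/* qsort comparator: order detections from left to right. */
static int cmp_by_x(const void *a, const void *b)
{
    float xa = ((const char_det *)a)->x;
    float xb = ((const char_det *)b)->x;
    return (xa > xb) - (xa < xb);
}

/* Sort the n detections in place and write the plate text into out,
 * which must have room for n + 1 bytes (including the terminator). */
void read_plate(char_det *dets, int n, char *out)
{
    int i;
    qsort(dets, (size_t)n, sizeof(char_det), cmp_by_x);
    for (i = 0; i < n; ++i) {
        out[i] = dets[i].label;
    }
    out[n] = '\0';
}
```

Plates with two rows of characters would need an extra grouping step by vertical position before sorting each row, which this sketch does not attempt.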

Muhammad057 commented 6 years ago

@ritwikdubey Yes, I have already detected the numbers on the license plate using YOLOv3 (I call this my recognition model). But before detecting the license plate (LP) numbers, I first segment out the license plate from the vehicle with my detection model and then feed it to the recognition model (as stated in my earlier comments). I think YOLO doesn't have any pipeline for multi-stage prediction, i.e., first detecting the LP and then recognizing the letters within the same src/detector.c code. Below is a snippet of detecting the numbers on an LP:

[image: recognized_image431 - copy]

kmsravindra commented 5 years ago

>     network net_classifier = parse_network_cfg_custom("cfg/resnet152.cfg", 1);
>     load_weights(&net_classifier , "resnet152.weights");

@AlexeyAB , Hi

I have a custom classifier (InceptionResNetV2) implemented in Python/Keras that takes cropped bounding boxes as image inputs. I want to plug that classifier on top of darknet detection. Could you please guide me on where I should define that model, compile it, and load the trained weights in this darknet detector code, and how to do it? Thanks for your help!

Favi0 commented 5 years ago

> @Muhammad057 Good to know.
>
> > One for License Plate Detection from an incoming vehicle, and secondly, the other model for recognizing the license plate from the detected model. I need to build a pipeline, where the output of detected model is fed to the recognized model. If you have any further insight, kindly share with me. Thank you.
>
> I sorta worked on similar problem. I found out that Yolo v3 was quite good in detecting numbers off license plate without character segmentation stage. I guess the way you're thinking is sufficient; 1- localize and segment the license plate region 2- Run prediction on the scaled license plate to detect the numbers.
>
> [image: predictions]

What do you mean by scaled?

barzan-hayati commented 5 years ago

@Muhammad057 Were you able to build a pipeline that combines the two stages of plate detection and recognition? I found an article that does license plate recognition with one model instead of two:

Real-Time Brazilian License Plate Detection and Recognition Using Deep Convolutional Neural Networks

Muhammad057 commented 5 years ago

@barzan-hayati Yes, I created a pipeline to combine both of my models. Anyway, thanks for sharing.

barzan-hayati commented 4 years ago

@Muhammad057 Could you please explain your method for the pipeline?