AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Problem using "network_predict" in C++ API #4343

Open frankiehe opened 4 years ago

frankiehe commented 4 years ago

@AlexeyAB Using your C++ API, "Detector" works, but it returns an "int" result ("bbox_t"), which is not that accurate once the result is resized. I tried to use "network_predict(net, X);" in the C++ API instead, which gives a "float" result ("box", which is what I want), but it crashed! Similar code works with the C++ API provided by pjreddie/darknet, but that is much slower than yours, so I want to use yours. Hope you can help me!

Thanks!

frankiehe commented 4 years ago

The code I use is similar to the "test_detector" code in your "detector.c"

frankiehe commented 4 years ago

My code here:


network net = *load_network_custom(".cfg", ".weights", 0, 1);
float* resizeImg;
size_t resizeSize=net.w*net.h*3*sizeof(float);
resizeImg=(float*)malloc(resizeSize);
convertMatToFloat(image, resizeImg, net.w, net.h);
network_predict(net, resizeImg);

frankiehe commented 4 years ago

@AlexeyAB Could you please help me? Really need that module in C++ API.

Thanks!

AlexeyAB commented 4 years ago

@frankiehe

> Using your C++ API, "Detector" works but it returns "int" result "bbox_t" and it's not that accurate when the result is resized.

Can you show examples?

Did you try to use such code? https://github.com/AlexeyAB/darknet/blob/63396082d7e77f4b460bdb2540469f5f1a3c7c48/src/yolo_console_dll.cpp#L653-L667

frankiehe commented 4 years ago

@AlexeyAB Thank you for your reply. I tried your code above, but it gives the same result.

I think the root cause of my problem is this: the "detector.detect_resized" method returns a "std::vector<bbox_t>" result, and the bounding box struct "bbox_t" contains "unsigned int x, y, w, h". After the unsigned int "bbox_t" is rescaled to the original image, the rounding error between "int" and "float" is amplified. When I use "./darknet detect .cfg .weights *.jpg" in the terminal with your packaged application (which actually uses "network_predict"), or the "network_predict(net, resizeImg)" method in the C++ API provided by pjreddie/darknet, the results are fine. Used together with "get_network_boxes", the "network_predict(net, resizeImg)" method yields a "detection" result whose bounding box struct "box" contains "float x, y, w, h", which is exactly what I want. But I can't use it with your C++ API, and that really concerns me.

Some code I use: The "detector.detect_resized" method:

cv::Mat image = cv::imread(filename); 
float* resizeImg;
size_t resizeSize=detector.get_net_width()*detector.get_net_height()*3*sizeof(float);
resizeImg=(float*)malloc(resizeSize);
convertMatToF3(image, resizeImg,  detector.get_net_width(), detector.get_net_height());//cv::Mat(CV_8UC3) to float*
image_t img = {detector.get_net_height(), detector.get_net_width(), 3, resizeImg};

std::vector<bbox_t> result_vec = detector.detect_resized(img, image.cols, image.rows, thresh);
result_vec = detector.tracking_id(result_vec);

The "network_predict" method:

cv::Mat image = cv::imread(filename); 
float* resizeImg;
size_t resizeSize=net.w*net.h*3*sizeof(float);
resizeImg=(float*)malloc(resizeSize);
convertMatToFloat(image, resizeImg, net.w, net.h);//cv::Mat(CV_8UC3) to float*
network_predict(net, resizeImg);
int nboxes=0;
detection *dets=get_network_boxes(net,image.cols,image.rows,thresh,0.5,0,1,&nboxes);

This code works with the C++ API provided by pjreddie/darknet, but that is much slower than yours, so I want to use yours.

Really need help, thanks!

AlexeyAB commented 4 years ago

Why do you use convertMatToFloat(image, resizeImg, net.w, net.h);//cv::Mat(CV_8UC3) to float* instead of auto det_image = detector.mat_to_image_resize(mat_img); ?

What is the convertMatToFloat() function?

> After Resizing the unsigned int "bbox_t" to the origin image, the error between "int" and "float" is augmented.

Show screenshot of image with wrong detections.

frankiehe commented 4 years ago

@AlexeyAB
The convertMatToFloat() function is just a self-written function that uses cv::resize and converts a cv::Mat (CV_8UC3) to float*. I think there is no difference between `convertMatToFloat(image, resizeImg, net.w, net.h);` and `auto det_image = detector.mat_to_image_resize(mat_img);` since I compared the code, and **when I tried `auto det_image = detector.mat_to_image_resize(mat_img);` instead of my function, the results were the same. So I think it is not the key point of my problem**.

A screenshot of detection result using "detector.detect_resized" and its "bbox_t": image

Another screenshot of detection result using "network_predict", "get_network_boxes" and its "box": image

Both results use the same input image, YOLO network, and parameters. As you can see, the detection result is not exactly "wrong", just not that accurate.

By the way, if the bounding box is larger than the object, or the resize ratio between the original image and the resized image is not too large, the small difference between the two detection results ("int" and "float") can hardly be seen by eye. You only notice the difference when you check the numerical results.

So I think the "detector.detect_resized" method works for most applications, but in my case I really need the more accurate result. Hope you can help, thanks!

tomerBarkai commented 4 years ago

Hi @AlexeyAB ,

We think we encountered a problem that resembles the discussed issue.

We use C++ on Windows.

We trained the network on our own data and it works great with the test command line ("darknet.exe detector test .."), but when using the DLL we get bad predictions (the network detects "garbage" boxes at random, i.e. false positives and almost no true positives), which we attribute to the different detect functions.

We used the same pictures, the same cfg files and the same threshold.

Here are two examples. Detection using network_predict (command line): WhatsApp Image 2020-06-15 at 13 39 14

Detection using the DLL detect function: WhatsApp Image 2020-06-15 at 13 39 13

During training we achieved 93% mAP, so we believe the model itself works well.

Is there any way to use the network_predict function in the dll instead of the detector.detect function?

Many thanks for your work! You made using a NN a very easy and intuitive task.