AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Why was performBatchDetect deleted? #6687

Open mazatov opened 4 years ago

mazatov commented 4 years ago

Why was the performBatchDetect function removed from darknet.py? Is there a plan to reintroduce it with better performance?

I'm trying to recreate it myself using darknet.network_predict_batch, but I'm struggling with the input parameters to the function. In the original darknet.py, the image preprocessing made batch processing slower than processing the images one by one with darknet.detect_image. Most of the time was spent reshaping the arrays. Is there a faster way to prepare images for darknet.network_predict_batch?

This was the original image prep for performBatchDetect:

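    # Excerpt only: net, image_list, net_width, net_height, c, batch_size,
    # pred_width, pred_height, thresh and hier_thresh are not defined here.
    img_list = []
    for custom_image_bgr in image_list:
        # BGR -> RGB, resize to the network input size, then HWC -> CHW
        custom_image = cv2.cvtColor(custom_image_bgr, cv2.COLOR_BGR2RGB)
        custom_image = cv2.resize(
            custom_image, (net_width, net_height), interpolation=cv2.INTER_NEAREST)
        custom_image = custom_image.transpose(2, 0, 1)
        img_list.append(custom_image)
    # Stack to (N*C, H, W), flatten, convert to float32 and scale to [0, 1]
    arr = np.concatenate(img_list, axis=0)
    arr = np.ascontiguousarray(arr.flat, dtype=np.float32) / 255.0
    # Wrap the buffer in a darknet IMAGE struct and run the batch prediction
    data = arr.ctypes.data_as(POINTER(c_float))
    im = IMAGE(net_width, net_height, c, data)
    batch_dets = network_predict_batch(net, im, batch_size, pred_width,
                                       pred_height, thresh, hier_thresh, None, 0, 0)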
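
One possible way to reduce the reshaping cost is to allocate the float32 batch buffer once and copy each image straight into its slot, rather than building a list, concatenating it, and then converting a flattened copy. The following is only a minimal NumPy sketch of that idea. `pack_batch` is a hypothetical helper, not part of darknet.py, and it assumes the images are already RGB and already resized to the network input size. Whether it actually removes the bottleneck has not been measured here.

```python
import numpy as np


def pack_batch(images, out=None):
    """Pack HxWxC uint8 images into one contiguous NxCxHxW float32 array in [0, 1].

    Hypothetical helper (not in darknet.py). Assumes every image is already
    RGB and already resized to the network input size.
    """
    n = len(images)
    h, w, c = images[0].shape
    if out is None:
        # Pass a preallocated buffer as `out` to reuse it across frames.
        out = np.empty((n, c, h, w), dtype=np.float32)
    for i, img in enumerate(images):
        # The transpose is a view; the assignment converts uint8 -> float32
        # while copying into the preallocated slot.
        out[i] = img.transpose(2, 0, 1)
    # Scale the whole batch in place, without another temporary array.
    out *= 1.0 / 255.0
    return out
```

The result has the same memory layout as the flattened array in the snippet above, so it could be handed over in the same way, with `batch.ctypes.data_as(POINTER(c_float))`.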

lars-ek commented 4 years ago

I am also interested in batch inference, because I have a setup with two cameras and want to speed up the detection process by using batch=2 inference.

I also tried OpenCV inference. For batch=1 I get a slight performance boost over darknet, but the very efficient ".detect" function does not support batch processing.

So my question is: what performance gain could I expect with batch size 2 in darknet compared to processing frames one by one? And if it is > 20%, how do I do it with darknet?

Thank you.

lars-ek commented 3 years ago

I got batch-processing working with the latest darknet version, but I do not see any performance gain for batch size > 1...
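
To put a number on "no performance gain", it can help to time both paths in the same way and compare seconds per image. The sketch below is a generic timing harness, not darknet code. `run_once` is a hypothetical zero-argument callable that you would supply yourself, for example a lambda wrapping one single-image call or one batch call. Including the image preparation inside that callable is probably sensible, since earlier in this thread the reshaping was reported to dominate the run time.

```python
import time


def mean_seconds_per_image(run_once, images_per_call, warmup=3, repeats=20):
    """Return the mean wall-clock seconds per image for a callable.

    run_once: hypothetical zero-argument callable doing one inference call.
    images_per_call: number of images each call processes (the batch size).
    """
    for _ in range(warmup):
        run_once()  # warm-up calls are excluded from the timing
    start = time.perf_counter()
    for _ in range(repeats):
        run_once()
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * images_per_call)


def speedup(single_per_image, batch_per_image):
    """Ratio above 1.0 means the batch path is faster per image."""
    return single_per_image / batch_per_image
```

With this, a batch of 2 would meet the "> 20%" target if `speedup(single, batch)` came out above 1.2.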

mazatov commented 3 years ago

How did you get it to work? I got it working the way I describe above, mimicking the older performBatchDetect, but I am not seeing any performance gain either.

lars-ek commented 3 years ago

I used the code above and called the network_predict_batch function in the compiled DLL (yolo_cpp_dll.dll).

lars-ek commented 3 years ago

@AlexeyAB I expected that there should be a performance gain using batch detection, but there is none. Can this be? Thank you.

pfeatherstone commented 3 years ago

I posted a related issue here: https://github.com/AlexeyAB/darknet/issues/6846. There is a bug in network_predict_batch when running inference on the GPU.

LeKristapino commented 3 years ago

The example for implementing predict_batch can be found here. I would like to use the solution @pfeatherstone mentioned in the issue he posted, but I'm using Python, not C, and the resize_network function is not mapped to Python, at least not in the darknet.py file.

But if there is no significant performance gain, then I guess it is not worth it for now.

pfeatherstone commented 3 years ago

In theory, batch prediction should be faster, since there are fewer memory transfers between CPU and GPU. Furthermore, I think cuDNN algorithms are optimised for batch sizes that are powers of 2.

pfeatherstone commented 3 years ago

With regard to calling this from Python, I think the easiest approach is to set the code up correctly in either C or C++, then write a Python binding using pybind11. Using pybind11 is super easy.

LeKristapino commented 3 years ago

@pfeatherstone Well, Python mappings for the compiled Darknet library already exist, and they are what is currently used for YOLO. The problem is that the resize_network function specifically is not mapped. Could you tell me which C++ file I should look at to see whether the function can be mapped in the darknet.py file?

pfeatherstone commented 3 years ago

If you grep for resize_network in the darknet folder, you will find the C file where it is declared. You will have to modify the Python bindings to expose it to Python.
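
As a rough illustration of what "modify the Python bindings" could look like with ctypes, here is a minimal sketch. It rests on two assumptions that should be checked against your own checkout: that the C signature is `int resize_network(network *net, int w, int h)`, and that the symbol is actually exported from the compiled library. `bind_resize_network` is a hypothetical helper name, and `lib` stands for the already loaded shared library object.

```python
from ctypes import c_int, c_void_p


def bind_resize_network(lib):
    """Attach ctypes argument and return types to lib.resize_network.

    ASSUMPTION: the C signature is
        int resize_network(network *net, int w, int h);
    Verify this against the header before relying on it.
    """
    # Attribute lookup fails with AttributeError if the symbol is not exported.
    fn = lib.resize_network
    # The network pointer is treated as an opaque pointer.
    fn.argtypes = [c_void_p, c_int, c_int]
    fn.restype = c_int
    return fn
```

A call would then look like `resize_network = bind_resize_network(lib)` followed by `resize_network(net, new_width, new_height)`, where `net` is the network handle you already have.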

pfeatherstone commented 3 years ago

Again, I found that there is a bug in darknet when doing batch inference on GPUs if you modify the batch size at runtime.

pfeatherstone commented 3 years ago

If you set the batch size in the .cfg file, you will probably be all right, but that's no good if you want to change the batch size depending on the inputs.
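
For the fixed-batch workaround, the batch size lives in the `[net]` section of the .cfg file. The sketch below rewrites that value in the text of a cfg file so a separate inference copy can be saved. `set_net_batch` is a hypothetical helper, not part of darknet. It also sets `subdivisions=1`, on the assumption that the batch should not be split into smaller chunks at inference time.

```python
def set_net_batch(cfg_text, batch, subdivisions=1):
    """Return cfg_text with batch/subdivisions rewritten in the [net] section only.

    Hypothetical helper. Commented-out lines and other sections are left as-is.
    """
    out, in_net = [], False
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # A new section header: track whether we are inside [net].
            in_net = stripped.lower() == "[net]"
        elif in_net and not stripped.startswith("#"):
            key = stripped.split("=", 1)[0].strip()
            if key == "batch":
                line = f"batch={batch}"
            elif key == "subdivisions":
                line = f"subdivisions={subdivisions}"
        out.append(line)
    return "\n".join(out) + "\n"
```

Usage would be along the lines of reading the original cfg with `open(path).read()`, calling `set_net_batch(text, 2)`, and writing the result to a new file that is then passed to darknet.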