pjreddie / darknet

Convolutional Neural Networks
http://pjreddie.com/darknet/

Segfault at: darknet/src/image.c:604 im.data[k*w*h + i*w + j] = data[i*step + j*c + k]/255.; #918

Open pirobot opened 6 years ago

pirobot commented 6 years ago

We are trying to run YOLO V2 on an omnidirectional (equirectangular) video stream of size 3760x480 pixels, and we almost immediately run into a segfault, which gdb identifies as follows:

ipl_into_image (src=<optimized out>, im=...) at home/patrick/jr2_catkin_ws/src/darknet_ros/darknet/src/image.c:604
604                     im.data[k*w*h + i*w + j] = data[i*step + j*c + k]/255.;

We have set width=3760 and height=480 in the cfg file and we get the same results whether we use yolov2-tiny or yolov2. We have tried going back to previous commits but we always get the same segfault.
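
For reference, the index arithmetic on the faulting line can be sketched in isolation. The helper below is hypothetical (it is not part of darknet, and the name `interleaved_to_planar` and its length parameters are assumptions made for illustration). It performs the same interleaved-to-planar conversion but first checks that both buffers are large enough for the given width, height, channel count and row stride, which is the condition the original line silently relies on. Assuming three channels, a 3760x480 frame needs 3760 * 480 * 3 = 5,414,400 floats in the destination.

```c
#include <stddef.h>

/* Hypothetical helper, not part of darknet: converts an interleaved 8-bit
   buffer (rows padded to `step` bytes) into planar floats in [0, 1].
   Returns 0 on success, -1 if the arguments are invalid or either buffer
   is too small for the requested dimensions. */
int interleaved_to_planar(const unsigned char *data, size_t data_len,
                          int w, int h, int c, int step,
                          float *out, size_t out_len)
{
    if (!data || !out || w <= 0 || h <= 0 || c <= 0 || step < w * c) return -1;

    /* Largest source index read is (h-1)*step + (w-1)*c + (c-1). */
    size_t need_src = (size_t)(h - 1) * (size_t)step + (size_t)w * (size_t)c;
    /* Largest destination index written is c*w*h - 1. */
    size_t need_dst = (size_t)w * (size_t)h * (size_t)c;
    if (data_len < need_src || out_len < need_dst) return -1;

    for (int i = 0; i < h; ++i) {
        for (int k = 0; k < c; ++k) {
            for (int j = 0; j < w; ++j) {
                /* Same indexing as image.c:604, with explicit size_t math. */
                out[(size_t)k * w * h + (size_t)i * w + j] =
                    data[(size_t)i * step + (size_t)j * c + k] / 255.0f;
            }
        }
    }
    return 0;
}
```

A check like this only catches size mismatches; it cannot detect a source buffer that has already been freed.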

Our environment is Ubuntu 16.04, CUDA 9.1. Thanks! patrick

rpfly3 commented 6 years ago

Same SIGSEGV here:

Thread 24200 "darknet_ros" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff9d43c700 (LWP 2659)]
---Type <return> to continue, or q <return> to quit---
ipl_into_image (src=<optimized out>, im=...) at /home/pengfei/Documents/catkin_ws/src/darknet_ros/darknet/src/image.c:604
604                     im.data[k*w*h + i*w + j] = data[i*step + j*c + k]/255.0;
(gdb) bt
#0  ipl_into_image (src=<optimized out>, im=...) at /home/pengfei/Documents/catkin_ws/src/darknet_ros/darknet/src/image.c:604
#1  0x00007ffff78ef8da in darknet_ros::YoloObjectDetector::fetchInThread (this=0x7fffffffc7c0)
    at /home/pengfei/Documents/catkin_ws/src/darknet_ros/darknet_ros/src/YoloObjectDetector.cpp:410
#2  0x00007ffff75dac80 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007ffff64526ba in start_thread (arg=0x7fff9d43c700) at pthread_create.c:333
#4  0x00007ffff6aa941d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Haven't found the reason yet.

rpfly3 commented 6 years ago

For now, my guess is that src is released somewhere before this function finishes reading it, so its pixel data is no longer accessible. Printing the address in gdb supports this: it is reported as not accessible. Here is my quick (dirty) fix:

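void ipl_into_image(IplImage* src, image im)
{
    // Work on a private copy so the loop never reads src's buffer directly.
    IplImage* temp = cvCloneImage(src);
    //unsigned char *data = (unsigned char *)src->imageData;
    unsigned char *data = (unsigned char *)temp->imageData;
    // Read the dimensions from the copy as well, to stay consistent with data.
    int h = temp->height;
    int w = temp->width;
    int c = temp->nChannels;
    int step = temp->widthStep;
    int i, j, k;

    // Interleaved 8-bit rows (padded to step bytes) -> planar floats in [0, 1].
    for(i = 0; i < h; ++i){
        for(k= 0; k < c; ++k){
            for(j = 0; j < w; ++j){
                im.data[k*w*h + i*w + j] = data[i*step + j*c + k]/255.0;
            }
        }
    }
    cvReleaseImage(&temp);
}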
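
The same copy-then-convert idea can be sketched without OpenCV, which makes its limits easier to see. Everything here is an illustrative assumption: `frame_view` is a hypothetical stand-in for the few IplImage fields used above, and `snapshot_into_planar` is not a darknet function. Note that if the source buffer really is released by another thread, copying only shortens the window in which the freed memory can be read (the copy itself still reads the source), so a lock or clear ownership of the frame would be needed for a complete fix.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the IplImage fields used by the conversion. */
typedef struct {
    int width, height, channels, step;   /* step = bytes per padded row */
    unsigned char *pixels;               /* interleaved 8-bit data */
} frame_view;

/* Copies the pixel buffer first, then converts from the private copy.
   `out` must hold width*height*channels floats.
   Returns 0 on success, -1 on bad input or allocation failure. */
int snapshot_into_planar(const frame_view *src, float *out)
{
    if (!src || !src->pixels || !out) return -1;
    int w = src->width, h = src->height, c = src->channels, step = src->step;
    if (w <= 0 || h <= 0 || c <= 0 || step < w * c) return -1;

    size_t bytes = (size_t)h * (size_t)step;
    unsigned char *copy = malloc(bytes);
    if (!copy) return -1;
    memcpy(copy, src->pixels, bytes);    /* the only read of the source */

    for (int i = 0; i < h; ++i) {
        for (int k = 0; k < c; ++k) {
            for (int j = 0; j < w; ++j) {
                out[(size_t)k * w * h + (size_t)i * w + j] =
                    copy[(size_t)i * step + (size_t)j * c + k] / 255.0f;
            }
        }
    }
    free(copy);
    return 0;
}
```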

pirobot commented 6 years ago

Awesome--this fixes the problem for me. Many thanks!

RhysMcK commented 6 years ago

I have also come across this problem. The above fix works for me as well! Cheers @rpfly3

rpfly3 commented 6 years ago

@RhysMcK Glad it helps.

shilaimu commented 6 years ago

Thanks so much for sharing!

clynamen commented 6 years ago

I can also confirm the bug and this fix.

OpenCV 3.4.3, CUDA 10.0.130

BRNKR commented 5 years ago

Fixed the problem for me too. I am using the darknet_ros wrapper package for pjreddie's darknet implementation.

m0rph03nix commented 5 years ago

Thanks @rpfly3. Fixed for me too (the process was dying with exit code -11). @pjreddie: merge?