Closed: prabindh closed this issue 7 years ago
I tried to replicate, and below is what I see:
./darknet-cpp detector demo cfg/combine9k.data cfg/yolo9000.cfg yolo9000.weights
Yeah, that is strange. The error occurs when I have GPU+CUDNN+OPENCV enabled. I have not tried it yet with GPU+CUDNN disabled. I'll test that on Monday and let you know how it goes.
I see. I might eventually hit it with GPU enabled too, after a long run. With CPU only, I hit it at the same spot very early, so I am looking at that first. If I fix it there, I think the GPU mode will also be fixed.
This is fixed now in my trials. There is a critical bug in darknet (mainline) that is fixed in tag v3.76. Please check tag v3.76: https://github.com/prabindh/darknet/releases/tag/v3.76 . Once you have checked it at your end, please update.
There are many fixes, but the critical one is at line 359 of src/region_layer.c (a write past the bounds of the probs array). This is what causes the heap corruption in question. The others are less critical, and I do not believe they are causing the current issue.
For now, I have commented out the offending line, pending further investigation into whether that line is really needed.
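To illustrate the kind of guard that would catch an out-of-bounds write like this one, here is a minimal sketch. The helper name `set_prob_checked` and its parameters are hypothetical and are not part of darknet. It only shows the general idea of checking an index against the allocated row length before writing, and it is not the change made in v3.76.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of darknet: store `value` in row[col] only
   when col lies inside the row's allocated length. Returns 1 on success and
   0 when the write would have landed past the end of the row. */
int set_prob_checked(float *row, size_t row_len, size_t col, float value)
{
    assert(row != NULL);
    if (col >= row_len) {
        return 0;   /* out of bounds: skip the write instead of touching the heap */
    }
    row[col] = value;
    return 1;
}
```

With a row allocated for 3 floats, a write at index 2 succeeds, while a write at index 3 is rejected and returns 0 instead of overwriting whatever follows the row on the heap.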
I pulled your changes and it seems to be working. It is still slow, but so is the straight C version. Thanks for taking a look.
So this leaves us with a buffer overflow bug in the mainline code. I hope Joseph reads the thread you commented in.
Hello WillieMaddox, if this issue does not recur, could you kindly close the issue? I will track it further in the mainline issues.
Seems to be working fine now. I don't have the option to close the issue.
Closing.
Copying from WillieMaddox's comment in the forum: https://groups.google.com/forum/#!topic/darknet/4Hb159aZBbA
I am also getting segfaults for ./darknet-cpp detector demo cfg/combine9k.data cfg/yolo9000.cfg yolo9000.weights. I located the error: it occurs when calling free(m.data) at the bottom of image.c (which is called in the while loop by free_image(disp) in demo.c).
I pulled the latest version of darknet and did a fresh make clean; make. I verified that the defaults in combine9k.data point to the correct files and that all the 9k.* files are present. I also tried running ./darknet-cpp detector demo cfg/coco.data cfg/yolo.cfg yolo.weights as a sanity check, and it ran with no errors. The straight C version of darknet also works fine with yolo9000. I put a breakpoint at free(m.data) and verified that the memory pointed to by m.data held a float (usually a value between 0.5 and 0.6 depending on the run) in each of the detector demo runs listed above.
But for some reason, yolo9000 crashes when trying to free m.data.
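A crash inside free() on a pointer that looks valid often means the heap was damaged earlier by a write past the end of some other buffer. One way to narrow that down is to place a known guard word directly after a buffer and check it before freeing. The sketch below assumes nothing about darknet itself: `guarded_alloc`, `guard_intact`, and `GUARD_WORD` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_WORD 0xDEADBEEFu

/* Hypothetical debugging helper: allocate n zeroed floats followed by one
   guard word. Release the result with free() as usual. */
float *guarded_alloc(size_t n)
{
    unsigned char *raw = calloc(1, n * sizeof(float) + sizeof(unsigned int));
    if (raw == NULL) {
        return NULL;
    }
    unsigned int guard = GUARD_WORD;
    memcpy(raw + n * sizeof(float), &guard, sizeof guard);
    return (float *)raw;
}

/* Returns 1 if the guard word after the n floats is unchanged, and 0 if
   something has written past the end of the buffer. */
int guard_intact(const float *buf, size_t n)
{
    unsigned int guard;
    assert(buf != NULL);
    memcpy(&guard, (const unsigned char *)buf + n * sizeof(float), sizeof guard);
    return guard == GUARD_WORD;
}
```

Calling guard_intact(buf, n) just before free(buf) reports an overflow at the buffer that was overrun, rather than at whichever later free() happens to trip over the damaged heap.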