zhengthomastang opened this issue 6 years ago
Hi, I am also a new user and ran into this problem. Has anyone solved this error? Regards
I faced this problem too a few days ago. I "solved" it by compiling Darknet's code together with my application. I suspect it has something to do with the mixing of `nvcc` and `gcc`/`clang`, specifically with code and/or data address space attribution.
I have a similar problem. It can be reproduced with basically this `main.c`:
```c
#include <stdio.h>
#include <stdlib.h>
#include <darknet.h>

// Paste test_detector from examples/detector.c here

int main(int argc, char* argv[])
{
    cuda_set_device(0);

    char* datacfg  = "cfg/coco.data";
    char* cfg      = "cfg/yolov3.cfg";
    char* weights  = "yolov3.weights";
    char* filename = "data/dog.jpg";

    test_detector(datacfg, cfg, weights, filename, .5, .5, 0, 0);

    return 0;
}
```
I am using the `test_detector` function from examples/detector.c. It will not work with the unmodified `test_detector` function, because the line `layer l = net->layers[net->n-1];` returns an uninitialised layer struct (i.e. `l.classes` is `0` instead of `80`, which is the default for the stock YOLOv3 config). The result is that nothing gets detected.
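A quick way to see the symptom, right after loading the network (a minimal sketch):

```c
layer l = net->layers[net->n - 1];
/* A correctly matched build prints 80 for the stock YOLOv3 config;
   the broken shared-object setup prints 0. */
printf("l.classes = %d\n", l.classes);
```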
Replacing the occurrences of `l.classes` with a hard-coded `80` makes it work, though.
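For instance, roughly like this inside `test_detector` (a sketch; the variable names follow examples/detector.c):

```c
int classes = 80;  /* hard-coded instead of the zeroed l.classes */
do_nms_sort(dets, nboxes, classes, nms);
draw_detections(im, dets, nboxes, thresh, names, alphabet, classes);
```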
Is there anything I can do to make it work, and still use darknet as a shared object? And obviously without hard-coding such stuff.
Hey @0xf3rn4nd0, can you share your CMake file? Thank you in advance.
I also came across this issue, but I don't think it's a problem of nvcc or gcc. Adding `-DGPU` to your Makefile can be a solution.
My Makefile starts with

```makefile
GPU=1
...
```

and I still have this problem. Like OP said, the error is gone in CPU mode.
@christiandreher does the problem persist if you compile Darknet's library with `-fPIC`?
@arjun-kava Sorry for taking so long to reply. I didn't use CMake, just `make` like Darknet itself. Here's an example where I was also using OpenCV. There are 3 targets:

- `darknet-cuda`: compiles all `*.cu` files from Darknet
- `darknet-framework`: compiles all `*.c` files from Darknet
- `main`: compiles the application that uses Darknet by linking with the `*.o` files created in the previous targets.

Then I do:

```
$ make darknet-cuda
$ make darknet-framework
$ make main
```
```makefile
# pushd/popd are bash builtins, so the recipes below need bash as the shell
SHELL = /bin/bash

OPENCV_DIR  = $(shell pwd)/ext/opencv
DARKNET_DIR = $(shell pwd)/ext/darknet
BIN_DIR     = $(shell pwd)/bin

NVCC            = nvcc
NV_CFLAGS       = -DGPU -DCUDNN -I$(DARKNET_DIR)/include \
                  -gencode arch=compute_60,code=sm_60 -c
NV_CFLAGS_EXTRA = -Wall -fPIC
NV_LFLAGS       = -L/usr/local/cuda/lib64 -lcuda -lcudart -lcublas -lcurand -lcudnn

DARKNET_CFLAGS = -DGPU -DCUDNN -Wall -O3 -march=native \
                 -I/usr/local/cuda/include/ -I$(DARKNET_DIR)/include -c

CC     = clang
CFLAGS = -DGPU -DCUDNN -std=c++11 -Wall -O3 -g -I$(OPENCV_DIR)/include \
         -I/usr/local/cuda/include/ -I$(DARKNET_DIR)/include -fPIC
LFLAGS = -L$(OPENCV_DIR)/lib \
         -Wl,-rpath,$(OPENCV_DIR)/lib \
         -L/usr/local/cuda/lib64 \
         -lopencv_core -lopencv_videoio -lopencv_imgproc \
         -lopencv_highgui -lstdc++ -lpthread -lm \
         -lcuda -lcudart -lcublas -lcurand -lcudnn

darknet-cuda: $(DARKNET_DIR)/src/*.cu
	mkdir -p $(BIN_DIR)
	pushd $(BIN_DIR) > /dev/null && \
	$(NVCC) $? $(NV_CFLAGS) --compiler-options "$(NV_CFLAGS_EXTRA)" && \
	popd > /dev/null

darknet-framework: $(DARKNET_DIR)/src/*.c
	mkdir -p $(BIN_DIR)
	pushd $(BIN_DIR) > /dev/null && \
	$(CC) $? $(DARKNET_CFLAGS) && \
	popd > /dev/null

main: main.cpp $(BIN_DIR)/*.o
	mkdir -p $(BIN_DIR)
	$(CC) $? $(CFLAGS) $(LFLAGS) -o $(BIN_DIR)/$@

clean:
	rm -rf $(BIN_DIR)
```
By the way, Darknet's examples also compile everything together. Maybe it is easier to start from there since you already know it is working.
@0xf3rn4nd0 Thank you for your feedback.
I am quite certain that I did not change anything in the Makefile since then, and looking at it now, there is this line:

```makefile
CFLAGS=-Wall -Wno-unused-result -Wno-unknown-pragmas -Wfatal-errors -fPIC
#                                                                   ^^^^^
```

So it seems that it was compiled with `-fPIC` in the first place. I also just recompiled Darknet (just to be sure) and tried to reproduce the error, and it still doesn't work for me. That is, I do not get an error, but `l.classes` is zero.
Please take note that my code changed a lot since then and this is not really an issue for me anymore. However, if you're trying to debug this, feel free to ask me anything.
Why is there a `srand(2222222);`? Can anybody explain?
@bvnp43 It is just an arbitrary number used to seed the PRNG.
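For context, a minimal sketch of what a fixed seed does:

```c
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    srand(2222222);                    /* fixed seed: rand() yields the    */
    printf("%d %d\n", rand(), rand()); /* same sequence on every run       */
    return 0;
}
```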
@christiandreher Were you able to solve this problem? If so, how?
@ou525 No, like I already mentioned: I worked around it, since I am reading the names file anyway. Doing so, I already have access to the total number of classes, so I just pass the size of the names vector (I am using C++) instead of `l.classes`.

However, if you know the total number of classes in advance, you can just hard-code the number to work around it. It's not pretty, but it should work. Otherwise you can also just read the names file and count the classes.

I am certain that there's a better way, or even a fix, but for me it was not worth the effort to investigate further. Hope this helps...
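Counting the classes from the names file could look roughly like this (a sketch in C; `count_classes` is a hypothetical helper, not part of Darknet):

```c
#include <stdio.h>

/* Hypothetical helper: count non-empty lines in e.g. data/coco.names
   and use the result in place of the uninitialised l.classes. */
int count_classes(const char *names_file)
{
    FILE *f = fopen(names_file, "r");
    if (!f) return -1;

    int n = 0, c, line_has_chars = 0;
    while ((c = fgetc(f)) != EOF) {
        if (c == '\n') {
            if (line_has_chars) n++;
            line_has_chars = 0;
        } else if (c != '\r') {
            line_has_chars = 1;
        }
    }
    if (line_has_chars) n++;  /* final line without trailing newline */

    fclose(f);
    return n;
}
```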
@christiandreher thanks. @ou525 There is another fork of this repo which contains some CPU/GPU optimizations and bug fixes: https://github.com/AlexeyAB/darknet. Maybe this bug is not present there. I'm working with that fork.
@christiandreher thank you, I also worked around this. It does not affect my usage, so it may not be necessary to investigate further.
@bvnp43 thanks, let me try
I am running into this and was trying to avoid having to build Darknet within my application.
Are there any other things to try? I am building with the Intel compiler (icc) and assume that icc and nvcc may not be getting along.
I solved this by adding the `GPU`/`CUDNN` macro definitions before including `darknet.h`.
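For example (a minimal sketch; the point is that your translation units must see the same `struct layer` layout as the one compiled into libdarknet.so, and the GPU path needs the CUDA include directories on the compile line):

```c
/* Define the same macros the library was built with *before* including
   darknet.h; under GPU/CUDNN the header adds extra fields to struct layer,
   so omitting them here shifts every field offset (which is why l.classes
   reads as 0). */
#define GPU
#define CUDNN
#include <darknet.h>
#include <stdio.h>

int main(void)
{
    /* If this differs from the size inside the library build,
       the layouts are out of sync. */
    printf("sizeof(layer) = %zu bytes\n", sizeof(layer));
    return 0;
}
```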
The problem is that when you change GPU and/or CUDNN to 1 in the Makefile, `struct layer` changes its fields and layout accordingly.

PS: I found that many functions in the source code take or return `struct layer` by value (not as a const pointer), which means the layer gets copied when passed into or out of these functions, and that copy is actually more than 1.5 KB of memory.
I followed the example of `test_detector()` in examples/detector.c to make a C++ interface for YOLO. The code is compiled using g++ and C++11 (the darknet.h and libdarknet.so files must be in the same folder):

```
g++ main.cpp -L. -ldarknet -o main -std=c++11
```

It works perfectly in CPU mode (GPU=0, CUDNN=0); however, it fails in GPU mode (GPU=1, CUDNN=1). The problem occurs in the network layer extraction at line 36:

```c
layer l = net->layers[net->n-1];
```
In both modes, `net->n-1` always returns 31. In CPU mode, the extracted network layer appears normal, with l.w==19, l.h==19 and l.n==5 for data/dog.jpg. But in GPU mode, the extracted network layer is empty, with l.w==0, l.h==0 and l.n==0. This error leads to a segmentation fault at line 60.

My complete code is attached as follows.
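A defensive check in the meantime might look like this (a sketch; it only fails fast instead of fixing the layout mismatch):

```c
layer l = net->layers[net->n - 1];

/* Bail out before the later dereference (line 60) segfaults. */
if (l.w == 0 || l.h == 0 || l.n == 0) {
    fprintf(stderr, "last layer is empty; check that the GPU/CUDNN "
                    "defines match the ones libdarknet.so was built with\n");
    return 1;
}
```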