digitalbrain79 / pyyolo

Simple python wrapper for YOLO.

Same run time with or without GPU #14

Closed MHsela closed 7 years ago

MHsela commented 7 years ago

Hi, I got the same run time with and without the GPU and CUDNN flags ("Predicted in 20.788111 seconds"). Can you advise how to enable the GPU, or what might have gone wrong? Thanks

digitalbrain79 commented 7 years ago

Could you share your Makefile and build process?

MHsela commented 7 years ago

Hi Thomaspark, you can see the Makefile and the build process below. Thanks,

`GPU=1
CUDNN=1
DEBUG=0
OPENCV=1

ARCH= -gencode arch=compute_20,code=[sm_20,sm_21] \
      -gencode arch=compute_30,code=sm_30 \
      -gencode arch=compute_35,code=sm_35 \
      -gencode arch=compute_50,code=[sm_50,compute_50] \
      -gencode arch=compute_52,code=[sm_52,compute_52]

# This is what I use, uncomment if you know your arch and want to specify
ARCH= -gencode arch=compute_52,code=compute_52

VPATH=./darknet/src
LIB=libyolo.a
OBJDIR=./obj/

CC=gcc
AR=ar
NVCC=nvcc
OPTS=-Ofast
COMMON=
CFLAGS=-Wall -Wfatal-errors -Wno-unused-result -fPIC
CFLAGS+=-I$(VPATH)

ifeq ($(DEBUG), 1)
OPTS=-O0 -g
endif

CFLAGS+=$(OPTS)

ifeq ($(OPENCV), 1)
COMMON+= -DOPENCV
CFLAGS+= -DOPENCV
COMMON+= `pkg-config --cflags opencv`
endif

ifeq ($(GPU), 1)
COMMON+= -DGPU -I/usr/local/cuda/include/
CFLAGS+= -DGPU
endif

ifeq ($(CUDNN), 1)
COMMON+= -DCUDNN
CFLAGS+= -DCUDNN
endif

OBJ=libyolo.o gemm.o utils.o cuda.o deconvolutional_layer.o convolutional_layer.o list.o image.o activations.o im2col.o col2im.o blas.o crop_layer.o dropout_layer.o maxpool_layer.o softmax_layer.o data.o matrix.o network.o connected_layer.o cost_layer.o parser.o option_list.o darknet.o detection_layer.o captcha.o route_layer.o writing.o box.o nightmare.o normalization_layer.o avgpool_layer.o coco.o dice.o yolo.o detector.o layer.o compare.o regressor.o classifier.o local_layer.o swag.o shortcut_layer.o activation_layer.o rnn_layer.o gru_layer.o rnn.o rnn_vid.o crnn_layer.o demo.o tag.o cifar.o go.o batchnorm_layer.o art.o region_layer.o reorg_layer.o lsd.o super.o voxel.o tree.o
ifeq ($(GPU), 1)
OBJ+=convolutional_kernels.o deconvolutional_kernels.o activation_kernels.o im2col_kernels.o col2im_kernels.o blas_kernels.o crop_layer_kernels.o dropout_layer_kernels.o maxpool_layer_kernels.o network_kernels.o avgpool_layer_kernels.o
endif

OBJS = $(addprefix $(OBJDIR), $(OBJ))
DEPS = $(wildcard src/*.h) Makefile

all: obj $(LIB)

$(LIB): $(OBJS)
	$(AR) rcs $@ $^

$(OBJDIR)%.o: %.c $(DEPS)
	$(CC) $(COMMON) $(CFLAGS) -c $< -o $@

$(OBJDIR)%.o: %.cu $(DEPS)
	$(NVCC) $(ARCH) $(COMMON) --compiler-options "$(CFLAGS)" -c $< -o $@

obj:
	mkdir -p obj

.PHONY: clean

clean:
	rm -rf $(OBJS) $(LIB)`
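A quick way to confirm which switches a Makefile like the one above actually sets is to read the plain `NAME=value` lines at the top. The following is a minimal sketch, not part of pyyolo: the function name `read_make_flags` and the sample text are made up for illustration, and it only sees assignments written in the file, not overrides passed on the command line (for example `make GPU=0`).

```python
import re


def read_make_flags(makefile_text, names=("GPU", "CUDNN", "OPENCV", "DEBUG")):
    """Return {name: value} for simple NAME=value lines in a Makefile.

    Hypothetical helper for illustration; it ignores `+=` assignments,
    conditionals, and anything passed on the make command line.
    """
    flags = {}
    for line in makefile_text.splitlines():
        # Match only whole-line assignments such as "GPU=1".
        match = re.match(r"^([A-Z_]+)\s*=\s*(\S*)\s*$", line)
        if match and match.group(1) in names:
            flags[match.group(1)] = match.group(2)
    return flags


# Sample text mirroring the first lines of the Makefile posted above.
sample = "GPU=1\nCUDNN=1\nDEBUG=0\nOPENCV=1\nCC=gcc\n"
flags = read_make_flags(sample)
print(flags)
```

If `GPU` or `CUDNN` comes back as anything other than `"1"`, the `ifeq` blocks that add `-DGPU` and `-DCUDNN` would not apply.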

make -j4

nvcc -gencode arch=compute_52,code=compute_52 -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN --compiler-options "-Wall -Wfatal-errors -Wno-unused-result -fPIC -I./darknet/src -Ofast -DOPENCV -DGPU -DCUDNN" -c ./darknet/src/convolutional_kernels.cu -o obj/convolutional_kernels.o
nvcc -gencode arch=compute_52,code=compute_52 -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN --compiler-options "-Wall -Wfatal-errors -Wno-unused-result -fPIC -I./darknet/src -Ofast -DOPENCV -DGPU -DCUDNN" -c ./darknet/src/deconvolutional_kernels.cu -o obj/deconvolutional_kernels.o
nvcc -gencode arch=compute_52,code=compute_52 -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN --compiler-options "-Wall -Wfatal-errors -Wno-unused-result -fPIC -I./darknet/src -Ofast -DOPENCV -DGPU -DCUDNN" -c ./darknet/src/activation_kernels.cu -o obj/activation_kernels.o
nvcc -gencode arch=compute_52,code=compute_52 -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN --compiler-options "-Wall -Wfatal-errors -Wno-unused-result -fPIC -I./darknet/src -Ofast -DOPENCV -DGPU -DCUDNN" -c ./darknet/src/im2col_kernels.cu -o obj/im2col_kernels.o
nvcc -gencode arch=compute_52,code=compute_52 -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN --compiler-options "-Wall -Wfatal-errors -Wno-unused-result -fPIC -I./darknet/src -Ofast -DOPENCV -DGPU -DCUDNN" -c ./darknet/src/col2im_kernels.cu -o obj/col2im_kernels.o
nvcc -gencode arch=compute_52,code=compute_52 -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN --compiler-options "-Wall -Wfatal-errors -Wno-unused-result -fPIC -I./darknet/src -Ofast -DOPENCV -DGPU -DCUDNN" -c ./darknet/src/blas_kernels.cu -o obj/blas_kernels.o
nvcc -gencode arch=compute_52,code=compute_52 -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN --compiler-options "-Wall -Wfatal-errors -Wno-unused-result -fPIC -I./darknet/src -Ofast -DOPENCV -DGPU -DCUDNN" -c ./darknet/src/crop_layer_kernels.cu -o obj/crop_layer_kernels.o
nvcc -gencode arch=compute_52,code=compute_52 -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN --compiler-options "-Wall -Wfatal-errors -Wno-unused-result -fPIC -I./darknet/src -Ofast -DOPENCV -DGPU -DCUDNN" -c ./darknet/src/dropout_layer_kernels.cu -o obj/dropout_layer_kernels.o
nvcc -gencode arch=compute_52,code=compute_52 -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN --compiler-options "-Wall -Wfatal-errors -Wno-unused-result -fPIC -I./darknet/src -Ofast -DOPENCV -DGPU -DCUDNN" -c ./darknet/src/maxpool_layer_kernels.cu -o obj/maxpool_layer_kernels.o
nvcc -gencode arch=compute_52,code=compute_52 -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN --compiler-options "-Wall -Wfatal-errors -Wno-unused-result -fPIC -I./darknet/src -Ofast -DOPENCV -DGPU -DCUDNN" -c ./darknet/src/network_kernels.cu -o obj/network_kernels.o
nvcc -gencode arch=compute_52,code=compute_52 -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN --compiler-options "-Wall -Wfatal-errors -Wno-unused-result -fPIC -I./darknet/src -Ofast -DOPENCV -DGPU -DCUDNN" -c ./darknet/src/avgpool_layer_kernels.cu -o obj/avgpool_layer_kernels.o
ar rcs libyolo.a obj/libyolo.o obj/gemm.o obj/utils.o obj/cuda.o obj/deconvolutional_layer.o obj/convolutional_layer.o obj/list.o obj/image.o obj/activations.o obj/im2col.o obj/col2im.o obj/blas.o obj/crop_layer.o obj/dropout_layer.o obj/maxpool_layer.o obj/softmax_layer.o obj/data.o obj/matrix.o obj/network.o obj/connected_layer.o obj/cost_layer.o obj/parser.o obj/option_list.o obj/darknet.o obj/detection_layer.o obj/captcha.o obj/route_layer.o obj/writing.o obj/box.o obj/nightmare.o obj/normalization_layer.o obj/avgpool_layer.o obj/coco.o obj/dice.o obj/yolo.o obj/detector.o obj/layer.o obj/compare.o obj/regressor.o obj/classifier.o obj/local_layer.o obj/swag.o obj/shortcut_layer.o obj/activation_layer.o obj/rnn_layer.o obj/gru_layer.o obj/rnn.o obj/rnn_vid.o obj/crnn_layer.o obj/demo.o obj/tag.o obj/cifar.o obj/go.o obj/batchnorm_layer.o obj/art.o obj/region_layer.o obj/reorg_layer.o obj/lsd.o obj/super.o obj/voxel.o obj/tree.o obj/convolutional_kernels.o obj/deconvolutional_kernels.o obj/activation_kernels.o obj/im2col_kernels.o obj/col2im_kernels.o obj/blas_kernels.o obj/crop_layer_kernels.o obj/dropout_layer_kernels.o obj/maxpool_layer_kernels.o obj/network_kernels.o obj/avgpool_layer_kernels.o
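One thing worth checking in a make log like this is whether every compile command carries `-DGPU`. The posted output shows only the `.cu` files being compiled before the `ar` step, so it says nothing about how the `.c` objects were built. The sketch below is a hypothetical helper (the name `sources_missing_flag` is not from pyyolo or Darknet) that lists sources compiled without a given flag. The `gcc` line in the sample is invented for illustration and does not come from this thread.

```python
import re


def sources_missing_flag(log_text, flag="-DGPU"):
    """List source files whose compile command in a make log lacks `flag`.

    Hypothetical helper: a line counts as a compile step when it contains
    "-c <file>.c" or "-c <file>.cu"; archive and link steps are skipped.
    """
    missing = []
    for line in log_text.splitlines():
        match = re.search(r"\s-c\s+(\S+\.(?:c|cu))\b", line)
        if not match:
            continue
        # Treat quotes as separators so flags inside --compiler-options count.
        tokens = line.replace('"', " ").split()
        if flag not in tokens:
            missing.append(match.group(1))
    return missing


# Illustrative log: the gcc line is made up, the nvcc line is shortened.
sample_log = (
    "gcc -DOPENCV -Wall -fPIC -I./darknet/src -Ofast "
    "-c ./darknet/src/gemm.c -o obj/gemm.o\n"
    "nvcc -gencode arch=compute_52,code=compute_52 -DGPU -DCUDNN "
    '--compiler-options "-Wall -fPIC -DGPU -DCUDNN" '
    "-c ./darknet/src/blas_kernels.cu -o obj/blas_kernels.o\n"
    "ar rcs libyolo.a obj/gemm.o obj/blas_kernels.o\n"
)
print(sources_missing_flag(sample_log))
```

An empty result for a full build log would suggest that all objects were compiled with the flag; it would not by itself prove that older object files on disk were rebuilt.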

python setup_gpu.py build

running build
running build_ext
building 'pyyolo' extension
creating build
creating build/temp.linux-x86_64-2.7
x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I/usr/include/python2.7 -c module.c -o build/temp.linux-x86_64-2.7/module.o
In file included from libyolo.h:3:0,
                 from module.c:5:
./darknet/src/image.h:107:1: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
 image **load_alphabet();
 ^
In file included from /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/ndarrayobject.h:27:0,
                 from /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/arrayobject.h:4,
                 from module.c:3:
/usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/__multiarray_api.h:1453:1: warning: ‘_import_array’ defined but not used [-Wunused-function]
 _import_array(void)
 ^
creating build/lib.linux-x86_64-2.7
x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -D_FORTIFY_SOURCE=2 -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security build/temp.linux-x86_64-2.7/module.o -L. -L/usr/local/cuda/lib64 -L/usr/local/ -lyolo -lcuda -lcudart -lcublas -lcurand -lcudnn -o build/lib.linux-x86_64-2.7/pyyolo.so
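The link command in this output already names the CUDA libraries, which can be confirmed mechanically. The sketch below uses a hypothetical helper, `linked_libraries`, to pull the `-l<name>` entries out of a linker command line; the sample is a shortened copy of the link line above. Linking against these libraries only shows that the extension was linked with CUDA. It does not show that the objects inside `libyolo.a` were compiled with `-DGPU`.

```python
import shlex


def linked_libraries(link_command):
    """Return library names passed as -l<name> in a linker command line.

    Hypothetical helper for illustration; it does not resolve paths or
    check that the libraries exist.
    """
    return [
        token[2:]
        for token in shlex.split(link_command)
        if token.startswith("-l") and len(token) > 2
    ]


# Shortened from the link line in the build output above.
link_line = (
    "x86_64-linux-gnu-gcc -pthread -shared build/temp.linux-x86_64-2.7/module.o "
    "-L. -L/usr/local/cuda/lib64 -L/usr/local/ "
    "-lyolo -lcuda -lcudart -lcublas -lcurand -lcudnn "
    "-o build/lib.linux-x86_64-2.7/pyyolo.so"
)
libs = linked_libraries(link_line)
expected = {"cudart", "cublas", "curand", "cudnn"}
print("missing CUDA libraries:", sorted(expected - set(libs)))
```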

sudo python setup_gpu.py install

[sudo] password for :
running install
running build
running build_ext
running install_lib
copying build/lib.linux-x86_64-2.7/pyyolo.so -> /usr/local/lib/python2.7/dist-packages
running install_egg_info
Removing /usr/local/lib/python2.7/dist-packages/pyyolo-0.1.egg-info
Writing /usr/local/lib/python2.7/dist-packages/pyyolo-0.1.egg-info

MHsela commented 7 years ago

Hi Thomaspark, I have just rebuilt both pyyolo and Darknet, and that solved the issue. Thanks!
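The thread does not say exactly which commands were run for the rebuild. A clean rebuild along these lines could be scripted as below, as a sketch only: `make -j4` and the two `setup_gpu.py` commands are the ones posted earlier in the thread, `make clean` is a target in the posted Makefile, and removing the `build` directory is an assumption added here so that the extension is not reused from an earlier build.

```python
import subprocess

# Steps for a from-scratch rebuild. Only `rm -rf build` is an assumption;
# the other commands appear in the thread or in the posted Makefile.
REBUILD_STEPS = [
    ["make", "clean"],
    ["make", "-j4"],
    ["rm", "-rf", "build"],
    ["python", "setup_gpu.py", "build"],
    ["sudo", "python", "setup_gpu.py", "install"],
]


def rebuild(steps=REBUILD_STEPS, dry_run=True):
    """Print each step and, unless dry_run is set, execute it in order."""
    for step in steps:
        print("$ " + " ".join(step))
        if not dry_run:
            # check=True stops at the first step that exits with an error.
            subprocess.run(step, check=True)
    return len(steps)


count = rebuild()  # dry run: only prints the commands
```

Calling `rebuild(dry_run=False)` from the pyyolo checkout would run the steps for real; the dry run is the default so nothing is deleted by accident.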

digitalbrain79 commented 7 years ago

Good!