yolov3-tiny_xnor.cfg running on ARM

joaomiguelvieira commented 5 years ago

Hi @AlexeyAB,

I am trying to run yolov3-tiny_xnor.cfg for detection on a Raspberry Pi. I have trained the network and tested it on an Intel-based system, where it works fine. However, when I run it on the RPi, nothing is detected! I am using the very same command and the very same version of the framework on both sides. Can you help me figure out what is going on?

I am using the command ./darknet detector test data/coco.data cfg/yolov3-tiny_xnor.cfg yolov3-tiny_xnor_last.weights data/person.jpg

The content of coco.data is

classes = 80
names   = data/coco/coco.names
backup  = backup/

The content of yolov3-tiny_xnor.cfg is

[net]
# Testing
batch=1
subdivisions=1
# Training
# batch=64
# subdivisions=2
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 500200
policy=steps
steps=400000,450000
scales=.1,.1

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
xnor=1
bin_output=1
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
xnor=1
bin_output=1
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
xnor=1
bin_output=1
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
xnor=1
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
xnor=1
bin_output=1
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=1

[convolutional]
xnor=1
bin_output=1
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

###########

[convolutional]
xnor=1
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear

[yolo]
mask = 3,4,5
anchors = 10,14,  23,27,  37,58,  81,82,  135,169,  344,319
classes=80
num=6
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

[route]
layers = -4

[convolutional]
xnor=1
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 8

[convolutional]
xnor=1
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear

[yolo]
mask = 0,1,2
anchors = 10,14,  23,27,  37,58,  81,82,  135,169,  344,319
classes=80
num=6
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

The .weights file can be found here: http://www.mediafire.com/file/vahpux9xefw1tci/yolov3-tiny_xnor_122000.weights

Finally, the person.jpg image is the one already present in the data folder.
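
For readers unfamiliar with the xnor=1 layers in the cfg above: in the XNOR-Net style of binarized convolution, weights and inputs are reduced to their signs, so a dot product becomes an XOR (or XNOR) followed by a bit count. The following is a minimal, portable C sketch of that idea only. It is not darknet's actual kernel code, and the names popcount32, pack_signs, and xnor_dot are hypothetical. A plain reference like this might help when comparing results between an Intel machine and an ARM board, since it uses no platform-specific intrinsics.

```c
#include <stdint.h>

/* Portable population count for a 32-bit word (no compiler intrinsics). */
static int popcount32(uint32_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        ++n;
    }
    return n;
}

/* Pack n floats into sign bits: bit k is 1 when v[k] >= 0 (meaning +1),
 * otherwise 0 (meaning -1). `out` must hold (n + 31) / 32 words. */
static void pack_signs(const float *v, int n, uint32_t *out)
{
    int k;
    for (k = 0; k < (n + 31) / 32; ++k) out[k] = 0;
    for (k = 0; k < n; ++k)
        if (v[k] >= 0.0f) out[k / 32] |= (uint32_t)1 << (k % 32);
}

/* Dot product of two {-1,+1} vectors of length n stored as packed sign bits.
 * Matching bits contribute +1 and differing bits -1, so
 * dot = n - 2 * popcount(a XOR b). */
static int xnor_dot(const uint32_t *a, const uint32_t *b, int n)
{
    int words = (n + 31) / 32, diff = 0, k;
    for (k = 0; k < words; ++k) {
        uint32_t x = a[k] ^ b[k];
        if (k == words - 1 && n % 32 != 0)
            x &= ((uint32_t)1 << (n % 32)) - 1;   /* ignore padding bits */
        diff += popcount32(x);
    }
    return n - 2 * diff;
}
```

To use it, pack two float vectors with pack_signs and pass the packed words to xnor_dot; the result equals the ordinary dot product of the two sign vectors.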

AlexeyAB commented 4 years ago

@Hugh-Chang Un-comment this part of code: https://github.com/AlexeyAB/darknet/blob/342a8d1561c19317f2d5fda0f099449b79b51716/src/network_kernels.cu#L119-L136

or add if (i == 10) { and } around this code to show only layer-10.

Yingxiu-Chang commented 4 years ago

@AlexeyAB Thanks. I will give it a try.

Yingxiu-Chang commented 4 years ago

@AlexeyAB Hi, I tried your modification, but there are still some problems. The code I modified is below:

if (i == 10) {
    cuda_pull_array(l.output_gpu, l.output, l.batch*l.outputs);
    if (l.out_w >= 0 && l.out_h >= 1 && l.c >= 3) {
        int j;
        for (j = 0; j < l.out_c; ++j) {
            image img = make_image(l.out_w, l.out_h, 3);
            memcpy(img.data, l.output + l.out_w*l.out_h*j, l.out_w*l.out_h * 1 * sizeof(float));
            memcpy(img.data + l.out_w*l.out_h * 1, l.output + l.out_w*l.out_h*j, l.out_w*l.out_h * 1 * sizeof(float));
            memcpy(img.data + l.out_w*l.out_h * 2, l.output + l.out_w*l.out_h*j, l.out_w*l.out_h * 1 * sizeof(float));
            char buff[256];
            sprintf(buff, "layer-%d slice-%d", i, j);
            show_image(img, buff);
            save_image(img, buff);
        }
        cvWaitKey(0); // wait press-key in console
        cvDestroyAllWindows();
    }
}
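
The three memcpy calls in the snippet above appear to copy one output channel (slice j) into each of the three planes of a 3-channel image, so that the slice displays as grayscale. A minimal standalone sketch of that copy, assuming the layer output is a contiguous channel-major float buffer; slice_to_gray3 is a hypothetical helper, not a darknet function:

```c
#include <string.h>

/* Copy channel j of a channel-major (C x H x W) float buffer into all three
 * planes of a 3-channel image buffer. dst must hold 3 * w * h floats. */
static void slice_to_gray3(const float *output, int w, int h, int j, float *dst)
{
    size_t plane = (size_t)w * (size_t)h;           /* floats per channel */
    const float *src = output + plane * (size_t)j;  /* start of slice j */
    int c;
    for (c = 0; c < 3; ++c)
        memcpy(dst + plane * (size_t)c, src, plane * sizeof(float));
}
```

Writing the offset once as plane * j makes it easier to check that every copy reads from the same slice and writes to a different plane.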

The errors when I built the Release configuration were: identifier "cvWaitKey" is undefined darknet D:\darknet\darknet-master\src\network_kernels.cu 134

identifier "cvDestroyAllWindows" is undefined darknet D:\darknet\darknet-master\src\network_kernels.cu 135

MSB3721 The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin\nvcc.exe" -gencode=arch=compute_30,code=\"sm_30,compute_30\" -gencode=arch=compute_75,code=\"sm_75,compute_75\" --use-local-env -ccbin "F:\Visual Studio\VC\bin\x86_amd64" -x cu -IC:\opencv\build\include -IC:\opencv_3.0\opencv\build\include -I....\include -I....\3rdparty\stb\include -I....\3rdparty\pthreads\include -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include" -I\include -I\include -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include" --keep-dir x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -DOPENCV -DCUDNN_HALF -DCUDNN -D_TIMESPEC_DEFINED -D_SCL_SECURE_NO_WARNINGS -D_CRT_SECURE_NO_WARNINGS -D_CRT_RAND_S -DGPU -DWIN32 -D_CONSOLE -D_LIB -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /Fdx64\Release\vc140.pdb /FS /Zi /MD " -o x64\Release\network_kernels.cu.obj "D:\darknet\darknet-master\src\network_kernels.cu"" exited with code 1. darknet C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V140\BuildCustomizations\CUDA 10.0.targets 712
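
The two "undefined identifier" errors indicate that cvWaitKey and cvDestroyAllWindows are not declared in the file being compiled. One possible way to inspect a layer without any window calls is to write each slice to an image file and open it afterwards. The sketch below is plain C with no OpenCV or darknet dependency; save_plane_pgm is a hypothetical helper, and the choice of the PGM format is an assumption made here for simplicity, not something suggested in the thread.

```c
#include <stdio.h>

/* Write one w*h float plane to a binary PGM (P5) file, linearly rescaling
 * [min, max] of the plane to [0, 255]. Returns 0 on success, -1 on failure. */
static int save_plane_pgm(const float *plane, int w, int h, const char *path)
{
    int n = w * h, k;
    float lo, hi, range;
    FILE *f;

    if (n <= 0) return -1;

    /* Find the value range so the slice uses the full gray scale. */
    lo = hi = plane[0];
    for (k = 1; k < n; ++k) {
        if (plane[k] < lo) lo = plane[k];
        if (plane[k] > hi) hi = plane[k];
    }
    range = (hi > lo) ? (hi - lo) : 1.0f;   /* avoid dividing by zero on flat planes */

    f = fopen(path, "wb");
    if (!f) return -1;
    fprintf(f, "P5\n%d %d\n255\n", w, h);
    for (k = 0; k < n; ++k)
        fputc((int)((plane[k] - lo) / range * 255.0f + 0.5f), f);
    return fclose(f) == 0 ? 0 : -1;
}
```

Called once per slice with a pointer to that slice's first float, this produces one small grayscale file per channel that most image viewers can open.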

AlexeyAB / darknet

yolov3-tiny_xnor.cfg running on ARM #2382