Open IanButterworth opened 4 years ago
@ianshmean Hi, Nice work!
Try commenting out these two lines to compile on Windows:
If that helps, you can open a PR.
That helps, thanks a lot! However, we have also set LDFLAGS="-lws2_32". Without that, I was getting this error:
g++ -shared -std=c++11 -fvisibility=hidden -DLIB_EXPORTS -Iinclude/ -I3rdparty/stb/include -Wall -Wfatal-errors -Wno-unused-result -Wno-unknown-pragmas -fPIC -Ofast -fPIC ./obj/image_opencv.o ./obj/http_stream.o ./obj/gemm.o ./obj/utils.o ./obj/dark_cuda.o ./obj/convolutional_layer.o ./obj/list.o ./obj/image.o ./obj/activations.o ./obj/im2col.o ./obj/col2im.o ./obj/blas.o ./obj/crop_layer.o ./obj/dropout_layer.o ./obj/maxpool_layer.o ./obj/softmax_layer.o ./obj/data.o ./obj/matrix.o ./obj/network.o ./obj/connected_layer.o ./obj/cost_layer.o ./obj/parser.o ./obj/option_list.o ./obj/darknet.o ./obj/detection_layer.o ./obj/captcha.o ./obj/route_layer.o ./obj/writing.o ./obj/box.o ./obj/nightmare.o ./obj/normalization_layer.o ./obj/avgpool_layer.o ./obj/coco.o ./obj/dice.o ./obj/yolo.o ./obj/detector.o ./obj/layer.o ./obj/compare.o ./obj/classifier.o ./obj/local_layer.o ./obj/swag.o ./obj/shortcut_layer.o ./obj/activation_layer.o ./obj/rnn_layer.o ./obj/gru_layer.o ./obj/rnn.o ./obj/rnn_vid.o ./obj/crnn_layer.o ./obj/demo.o ./obj/tag.o ./obj/cifar.o ./obj/go.o ./obj/batchnorm_layer.o ./obj/art.o ./obj/region_layer.o ./obj/reorg_layer.o ./obj/reorg_old_layer.o ./obj/super.o ./obj/voxel.o ./obj/tree.o ./obj/yolo_layer.o ./obj/gaussian_yolo_layer.o ./obj/upsample_layer.o ./obj/lstm_layer.o ./obj/conv_lstm_layer.o ./obj/scale_channels_layer.o ./obj/sam_layer.o src/yolo_v2_class.cpp -o libdarknet.dll -lm -pthread
src/yolo_v2_class.cpp:1:0: warning: -fPIC ignored for target (all code is position independent) [enabled by default]
#include "darknet.h"
^
src/yolo_v2_class.cpp: In member function ‘std::vector<bbox_t> Detector::tracking_id(std::vector<bbox_t>, bool, int, int)’:
src/yolo_v2_class.cpp:370:42: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (prev_bbox_vec_deque.size() > frames_story) prev_bbox_vec_deque.pop_back();
^
src/yolo_v2_class.cpp:385:36: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (cur_dist < max_dist && (k.track_id == 0 || dist_vec[m] > cur_dist)) {
^
src/yolo_v2_class.cpp:409:42: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (prev_bbox_vec_deque.size() > frames_story) prev_bbox_vec_deque.pop_back();
^
./obj/http_stream.o:http_stream.cpp:(.text+0x6d): undefined reference to `_imp__shutdown@8'
./obj/http_stream.o:http_stream.cpp:(.text+0xb3): undefined reference to `_imp__recv@16'
./obj/http_stream.o:http_stream.cpp:(.text+0xd9): undefined reference to `_imp__closesocket@4'
./obj/http_stream.o:http_stream.cpp:(.text+0x1c1): undefined reference to `_imp__shutdown@8'
./obj/http_stream.o:http_stream.cpp:(.text+0x3a1): undefined reference to `_imp__shutdown@8'
./obj/http_stream.o:http_stream.cpp:(.text+0x3e7): undefined reference to `_imp__socket@12'
./obj/http_stream.o:http_stream.cpp:(.text+0x40f): undefined reference to `_imp__htons@4'
./obj/http_stream.o:http_stream.cpp:(.text+0x44d): undefined reference to `_imp__setsockopt@20'
./obj/http_stream.o:http_stream.cpp:(.text+0x486): undefined reference to `_imp__ioctlsocket@12'
./obj/http_stream.o:http_stream.cpp:(.text+0x4ba): undefined reference to `_imp__bind@12'
./obj/http_stream.o:http_stream.cpp:(.text+0x4e6): undefined reference to `_imp__listen@8'
./obj/http_stream.o:http_stream.cpp:(.text+0x6a1): undefined reference to `_imp__shutdown@8'
./obj/http_stream.o:http_stream.cpp:(.text+0xae9): undefined reference to `_imp__shutdown@8'
./obj/http_stream.o:http_stream.cpp:(.text+0xb30): undefined reference to `_imp__socket@12'
./obj/http_stream.o:http_stream.cpp:(.text+0xb58): undefined reference to `_imp__htons@4'
./obj/http_stream.o:http_stream.cpp:(.text+0xb96): undefined reference to `_imp__setsockopt@20'
./obj/http_stream.o:http_stream.cpp:(.text+0xbcf): undefined reference to `_imp__ioctlsocket@12'
./obj/http_stream.o:http_stream.cpp:(.text+0xc07): undefined reference to `_imp__bind@12'
./obj/http_stream.o:http_stream.cpp:(.text+0xc33): undefined reference to `_imp__listen@8'
./obj/http_stream.o:http_stream.cpp:(.text+0xde8): undefined reference to `_imp__shutdown@8'
/opt/i686-w64-mingw32/bin/../lib/gcc/i686-w64-mingw32/4.8.5/../../../../i686-w64-mingw32/bin/ld: ./obj/http_stream.o: bad reloc address 0x17 in section `.text.unlikely'
collect2: error: ld returned 1 exit status
make: *** [Makefile:136: libdarknet.dll] Error 1
Yes, it can require the libwsock32.a library (LDFLAGS+=-lws2_32 in the Makefile) if you use MinGW instead of MSVS/Cygwin on Windows.
So it is better to add this there: https://github.com/AlexeyAB/darknet/blob/10c40551dcadec68050befa6a1cecc6f69049d0d/Makefile#L75
ifeq ($(OS),Windows_NT)
LDFLAGS+=-lws2_32
endif
Ok, in our case uname returns MSYS_NT-6.* though.
I managed to build for i686-w64-mingw32, now we're only missing x86_64-w64-mingw32:
# make libdarknet.${dlext} LIBNAMESO="libdarknet.${dlext}" LIBSO=1 GPU=0 CUDNN=0 CUDNN_HALF=0 OPENCV=0 DEBUG=0 OPENMP=0 LIBSO=1 ZED_CAMERA=0
gcc -Iinclude/ -I3rdparty/stb/include -Wall -Wfatal-errors -Wno-unused-result -Wno-unknown-pragmas -fPIC -Ofast -fPIC -c ./src/gemm.c -o obj/gemm.o
./src/gemm.c:1:0: warning: -fPIC ignored for target (all code is position independent) [enabled by default]
#include "gemm.h"
^
In file included from ./src/gemm.c:519:0:
/opt/x86_64-w64-mingw32/lib/gcc/x86_64-w64-mingw32/4.8.5/include/ammintrin.h:31:3: error: #error "SSE4A instruction set not enabled"
# error "SSE4A instruction set not enabled"
^
compilation terminated due to -Wfatal-errors.
make: *** [Makefile:150: obj/gemm.o] Error 1
Do you use AVX=0 or AVX=1 in Makefile?
I didn't modify the Makefile apart from the LDFLAGS setting, so the default (shown above) is 0. I also tried with AVX=1:
# make libdarknet.${dlext} LIBNAMESO="libdarknet.${dlext}" LIBSO=1 GPU=0 CUDNN=0 CUDNN_HALF=0 OPENCV=0 DEBUG=0 OPENMP=0 LIBSO=1 ZED_CAMERA=0 AVX=1
gcc -Iinclude/ -I3rdparty/stb/include -Wall -Wfatal-errors -Wno-unused-result -Wno-unknown-pragmas -fPIC -ffp-contract=fast -mavx -mavx2 -msse3 -msse4.1 -msse4.2 -msse4a -Ofast -fPIC -c ./src/gemm.c -o obj/gemm.o
./src/gemm.c:1:0: warning: -fPIC ignored for target (all code is position independent) [enabled by default]
#include "gemm.h"
^
./src/gemm.c: In function ‘_castu32_f32’:
./src/gemm.c:534:5: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
return *((float *)&a);
^
./src/gemm.c: In function ‘_mm256_extract_float32’:
./src/gemm.c:538:13: error: request for member ‘m256_f32’ in something not a structure or union
return a.m256_f32[index];
^
compilation terminated due to -Wfatal-errors.
make: *** [Makefile:150: obj/gemm.o] Error 1
Try to change this line: https://github.com/AlexeyAB/darknet/blob/10c40551dcadec68050befa6a1cecc6f69049d0d/src/gemm.c#L515
to this
#if (defined(__AVX__) && defined(__x86_64__)) || (defined(_WIN64) && !defined(__MINGW32__))
Yes, this does the trick for the build without AVX (I'm not sure if @ianshmean needs AVX, though). Thank you very much.
I'll open a PR with the changes we're using.
This is great. Although, given that we're CPU-only for the moment, it would be nice to have AVX enabled, given:
improved performance of detection and training on Intel CPU with AVX (Yolo v3 ~85%, Yolo v2 ~10%)
If we build with AVX but run the binaries on a non-Intel processor, would errors occur? Also the same question for GPU, CUDNN, CUDNN_HALF..?
It would be great if we could build a single fully-functional binary that could make use of whatever's on the user's machine.
If we build with AVX, but run the binaries on a non-intel processor would errors occur?
AVX1 and AVX2 are supported on both Intel and AMD CPUs.
If you use an old Intel/AMD CPU without AVX1/AVX2, it will still work without errors, because AVX is disabled automatically at run-time: https://github.com/AlexeyAB/darknet/blob/10c40551dcadec68050befa6a1cecc6f69049d0d/src/gemm.c#L684-L704
Code that is compiled for x86_64 will not work on non-x86_64 CPUs like ARM, so for those you should compile it with an ARM compiler.
Also the same question for GPU, CUDNN, CUDNN_HALF..?
If you compiled it with GPU=1 CUDNN=1, then CUDA and cuDNN must be installed and an NVIDIA GPU must be present; otherwise it will not work.
Ok, how about CUDNN_HALF? If that binary is run on a non-Tesla GPU, would it fail?
CUDNN_HALF=1 will be checked in run-time too.
I think 3 sets of binaries makes sense:
- OPENMP=1
- OPENMP=1 AVX=1
- GPU=1 CUDNN=1 CUDNN_HALF=1
Also I don't know what about OpenCV.
Can you show how Darknet can be used from Julia-language?
Great! Arm can have CUDA, i.e. the Jetson boards, so that can be included in the last group.
With this grouping we can serve every platform with two binary releases:
- OPENMP=1 on all, AVX=1 on all except arm, powerpc windows
- GPU=1 CUDNN=1 CUDNN_HALF=1 on all
As for how to run it on Julia:
- ] to enter pkg mode
- add Darknet to install Darknet
From there, you can use the examples on the readme here.
For instance:
using Darknet, FileIO
d = "/path/to/weights_and_config_files/"
weightsfile = "yolov3-tiny.weights"
cfgfile = "yolov3-tiny.cfg"
datafile = "coco.data"
imagefile = "/path/to/images/test.jpg"
net = Darknet.load_network(joinpath(d, cfgfile), joinpath(d, weightsfile), 1)
meta = Darknet.get_metadata(joinpath(d, datafile));
img_d = Darknet.load_image_color(imagefile, 0, 0);
results = Darknet.detect(net, meta, img_d, thresh=0.1, nms=0.3)
Currently it's limited to detection only, but all of the exposed methods are wrapped and waiting for convenience functions to be written around them.
Edit: Simplified example
I just wanted to report on some progress that's been made on putting together a Julia (julialang.org) wrapper for this branch of Darknet. It is based on pre-built binaries of this branch and will require no installation steps beyond installing Julia and typing ]add Darknet.
Darknet.jl
This is the current status of this package: https://github.com/ianshmean/Darknet.jl It has two manually built binaries for Linux and macOS, and convenience functions for running Darknet (not training yet). We're close to releasing support for further platforms, but haven't figured out Windows yet.
Cross-compilation of binaries
Darknet.jl is based on pre-compiled binaries that are built automatically using Julia's BinaryBuilder.jl package and CI builder infrastructure. This approach requires no build on the user's machine, just an automated download and unpack. We're trying to get cross-compilation working across 13 target platforms, but haven't figured out Windows yet. All others seem successful (Windows was omitted from this build because it currently fails). You can see the latest output of the builder here: https://dev.azure.com/JuliaPackaging/Yggdrasil/_build/results?buildId=224&view=results
Currently we're targeting CPU-only, as a starting place.
You can see the draft cross-compile build script here https://github.com/JuliaPackaging/Yggdrasil/pull/202/files
It would be great to fix Windows, if you have any tips on how to modify our existing build script (we've tried to apply some patches). Also, for this work, I'm keen for Darknet to start following semver, so that we can be specific and clear about which version we're building (https://github.com/AlexeyAB/darknet/issues/2671).
cc. @giordano