Open Yang507 opened 6 years ago
@AlexeyAB Hi,
I have the same issue. I am running 3 instances of a Python HTTP server with the yolo dll behind an nginx load balancer, and I get
unspecified launch failure
Any recommendations?
When I run a single instance it works fine.
Thanks.
@dreambit Hi,
test.mp4 file)?
Try to open \yolo_cpp_dll.sln in MSVS -> (right click on project) -> Properties -> C/C++ -> Preprocessor -> Preprocessor Definitions, and change NDEBUG; to DEBUG; there. Recompile yolo_cpp_dll.dll and run again.
Then show me a screenshot of the full error.
@AlexeyAB
- What CUDA, cuDNN and GPU do you use?
GTX 1080 Ti 11 GB, cuda_10.0.130_411.31_win10, cudnn-10.0-windows10-x64-v7.4.2.24, opencv-3.4.0-vc14_vc15
- Do you get this error immediately or after several detections?
Sometimes right after the weights are loaded. For darknet_video.py with 3 instances. When I run 2 instances it works.
Another issue is that when I use darknet.py or darknet_video.py it is very CPU intensive (Intel i5). Two darknet_video instances use > 90% CPU and less than 10% GPU.
When I run darknet detector test, detection time is ~35 ms, while with Python it is > 90 ms.
I also noticed that darknet.py uses predict_image = lib.network_predict_image while the dll API is
struct bbox_t {
    unsigned int x, y, w, h;       // (x,y) - top-left corner, (w, h) - width & height of bounded box
    float prob;                    // confidence - probability that the object was found correctly
    unsigned int obj_id;           // class of object - from range [0, classes-1]
    unsigned int track_id;         // tracking id for video (0 - untracked, 1 - inf - tracked object)
    unsigned int frames_counter;   // counter of frames on which the object was detected
};

class Detector {
public:
    Detector(std::string cfg_filename, std::string weight_filename, int gpu_id = 0);
    ~Detector();

    std::vector<bbox_t> detect(std::string image_filename, float thresh = 0.2, bool use_mean = false);
    std::vector<bbox_t> detect(image_t img, float thresh = 0.2, bool use_mean = false);
    static image_t load_image(std::string image_filename);
    static void free_image(image_t m);

#ifdef OPENCV
    std::vector<bbox_t> detect(cv::Mat mat, float thresh = 0.2, bool use_mean = false);
    std::shared_ptr<image_t> mat_to_image_resize(cv::Mat mat) const;
#endif
};
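If the detections are to be read from Python, a struct such as `bbox_t` can be mirrored with `ctypes`. The following is a minimal sketch that only reproduces the field layout from the listing above. The class name `BBox` is a hypothetical choice, and the sketch neither loads the DLL nor calls any of its functions. The C++ `Detector` class itself is not directly callable through `ctypes`.

```python
import ctypes


class BBox(ctypes.Structure):
    """Field-for-field mirror of bbox_t from the listing above.

    The name BBox is hypothetical; only the field order and types
    are taken from the listing.
    """
    _fields_ = [
        ("x", ctypes.c_uint),               # top-left corner, x
        ("y", ctypes.c_uint),               # top-left corner, y
        ("w", ctypes.c_uint),               # box width
        ("h", ctypes.c_uint),               # box height
        ("prob", ctypes.c_float),           # detection confidence
        ("obj_id", ctypes.c_uint),          # class index
        ("track_id", ctypes.c_uint),        # 0 means untracked
        ("frames_counter", ctypes.c_uint),  # frames on which the object was seen
    ]


# Illustrative values only, not real detector output.
box = BBox(x=10, y=20, w=100, h=50, prob=0.9, obj_id=3)
print(box.x, box.y, box.w, box.h, box.obj_id, box.track_id)
```

Fields that are not passed to the constructor are zero-initialised, so `track_id` reads as untracked here.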
When I set DEBUG it becomes extremely slow.
The error appears with NDEBUG; I can't reproduce it when DEBUG is set.
@dreambit
What version of Darknet do you use? Try to use the latest version of this repository.
I can't reproduce this bug even without DEBUG; I waited a few minutes:
- Do you get this error immediately or after several detections?
Sometimes right after the weights are loaded. For darknet_video.py with 3 instances. When I run 2 instances it works.
Did you run darknet_video.py with yolo_cpp_dll.dll compiled with DEBUG?
The error message looks like it was compiled without the DEBUG definition.
I also noticed that darknet.py uses predict_image = lib.network_predict_image while the dll API is
I fixed the Readme: there are 2 APIs, a C API and a C++ API: https://github.com/AlexeyAB/darknet#how-to-use-yolo-as-dll-and-so-libraries
@AlexeyAB Thanks for your time. I cloned into a new folder and darknet_video works fine with 3 instances. I don't know why. I will also test darknet.py.
Could you explain the CPU load? The CPU is at 100% while the GPU is at 20-30% (i5-2300). Is it possible to take the load off the CPU? It looks like the CPU is the bottleneck here.
Thanks
@AlexeyAB Thanks. I am not sure, but I think this error occurs when the network size is large (736x736 in my case) and the instance count is 3. Thanks for your help :)
@dreambit
Could you explain the CPU load? The CPU is at 100% while the GPU is at 20-30% (i5-2300). Is it possible to take the load off the CPU? It looks like the CPU is the bottleneck here.
This is an issue with the Python example.
Thanks. I am not sure, but I think this error occurs when the network size is large (736x736 in my case) and the instance count is 3. Thanks for your help :)
Maybe there is just not enough GPU-RAM?
@AlexeyAB
This is an issue with the Python example.
I am not sure. I added lots of prints of the execution time, and the most resource-intensive part is predict_image(net, im), which is just the lib.network_predict_image call. Maybe it is a Python problem with the dll.
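One way to separate the cost of the native call from the Python code around it is to time each stage on its own. A minimal sketch: `fake_preprocess` and `fake_predict` are hypothetical stand-ins for the frame conversion done in Python and for `predict_image(net, im)`, and would need to be swapped for the real calls to measure anything meaningful.

```python
import time


def timed(fn, *args, repeats=5):
    """Call fn(*args) `repeats` times; return (last result, mean seconds per call)."""
    start = time.perf_counter()
    for _ in range(repeats):
        result = fn(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed / repeats


def fake_preprocess(frame):
    # Hypothetical stand-in for per-frame work done in pure Python.
    return [value / 255.0 for value in frame]


def fake_predict(data):
    # Hypothetical stand-in for predict_image(net, im).
    return sum(data)


frame = list(range(256)) * 100
data, t_pre = timed(fake_preprocess, frame)
_, t_pred = timed(fake_predict, data)
print(f"preprocess: {t_pre * 1000:.3f} ms, predict: {t_pred * 1000:.3f} ms")
```

Comparing the two averages shows which stage dominates a frame, without relying on overall CPU and GPU utilisation figures.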
Maybe there is just not enough GPU-RAM?
Maybe. I thought about that, but in that case the error unspecified launch failure is misleading, because usually when there is not enough memory, out of memory is thrown.
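To check whether memory is the limit before starting another instance, current GPU usage can be read from `nvidia-smi`. The sketch below parses the CSV output of `nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits`, which reports values in MiB. The helper names are hypothetical, and the sample line is made-up illustrative data, not a measurement from this thread.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse 'used, total' pairs in MiB, one line per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(value) for value in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Run nvidia-smi and parse its output (requires an NVIDIA driver)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_memory(out)


# Made-up sample line so the parser can be tried without a GPU.
sample = "9120, 11264\n"
for used, total in parse_gpu_memory(sample):
    print(f"{used} MiB used of {total} MiB, {total - used} MiB free")
```

The parser assumes numeric fields; a device that reports a non-numeric value for memory would need extra handling.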
@dreambit
Try to catch this bug with the DEBUG definition.
Thanks. I am not sure, but I think this error occurs when the network size is large (736x736 in my case) and the instance count is 3. Thanks for your help :)
There shouldn't be enough GPU-RAM to run 3 instances of yolov3.cfg with width=736 height=736 batch=1, because 1 instance occupies ~4.5 GB GPU-RAM even if I run ./darknet detector test... So 3 instances require ~13.5 GB GPU-RAM, which is more than the 11 GB on the GTX 1080 Ti.
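The arithmetic above can be written out as a small check. This sketch uses only the two figures quoted in this thread (~4.5 GB per instance, 11 GB on the card); the function names are hypothetical, and real usage will vary with the cfg and network size.

```python
PER_INSTANCE_GB = 4.5  # quoted above for yolov3.cfg at 736x736, batch=1
GPU_TOTAL_GB = 11.0    # GTX 1080 Ti


def required_gb(instances, per_instance_gb=PER_INSTANCE_GB):
    """Total GPU-RAM needed for a number of identical instances."""
    return instances * per_instance_gb


def max_instances(total_gb=GPU_TOTAL_GB, per_instance_gb=PER_INSTANCE_GB):
    """Largest whole number of instances that fits in total_gb."""
    return int(total_gb // per_instance_gb)


for n in (1, 2, 3):
    need = required_gb(n)
    verdict = "fits" if need <= GPU_TOTAL_GB else "does not fit"
    print(f"{n} instance(s): {need} GB, {verdict}")

print("max instances:", max_instances())
```

With these figures two instances fit and three do not, which matches the report that 2 instances worked while 3 failed.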
When I built libdarknet.so and tested my model with the library, I ran the code with multiple threads on a single GPU, using only the yolo object, but the program ran into a problem:
So I wonder whether darknet supports multithreading with only one GPU. I am running on the NVIDIA Jetson TX2.
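If several threads must share one detector object, one conservative workaround is to serialise the calls with a lock. The sketch below assumes the detector is not safe to call concurrently, which this thread does not confirm. `SerializedDetector` and `fake_detect` are hypothetical names, and `fake_detect` merely stands in for the real detection call.

```python
import threading


class SerializedDetector:
    """Wrap a detect callable so that only one thread runs it at a time."""

    def __init__(self, detect_fn):
        self._detect_fn = detect_fn
        self._lock = threading.Lock()

    def detect(self, frame):
        # Other threads block here until the current call returns.
        with self._lock:
            return self._detect_fn(frame)


def fake_detect(frame):
    # Hypothetical stand-in for the real detection call.
    return [("object", frame)]


detector = SerializedDetector(fake_detect)
results = {}


def worker(index):
    results[index] = detector.detect(index)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(sorted(results))
```

This trades throughput for safety: the threads still overlap their own pre- and post-processing, but only one detection runs at a time.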