AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

Issues with detection on grayscale images (one channel) #2525

Open rniebecker opened 5 years ago

rniebecker commented 5 years ago

I compiled darknet (latest version) with OpenCV (3.3.0) support and I'm training a custom YoloV3 from scratch with channels set to 1.

I'm now working on adding grayscale support to my C++ app using yolo_v2_class.hpp. I use a GStreamer pipeline to receive frames from an RTSP server and convert them into a cv::Mat in CV_8UC3 format, then call cv::cvtColor(f, gray, cv::COLOR_BGR2GRAY) to create a 1-channel grayscale Mat.

If I now try to run detection by passing that cv::Mat to detect(cv::Mat, float, bool), I get an error message from OpenCV: detect() calls mat_to_image(), which attempts a COLOR_RGB2BGR color conversion, and that fails because my Mat has only one channel.

Since OpenCV uses BGR format internally, I'm not sure this color conversion makes any sense at all...

Anyway, I can simply remove the conversion, which gets rid of the OpenCV error, but then the detection call crashes my app. Unfortunately I can't debug this properly on Windows with MSVC to find the root cause.

As a workaround, I'm currently converting the image to grayscale and then back into a 3-channel image, which works but is not really an elegant solution.
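For readers who want to see what that workaround amounts to, here is a minimal sketch on a plain byte buffer, so it runs without OpenCV. The helper name gray_to_bgr is hypothetical; with a cv::Mat the equivalent call would be cv::cvtColor(gray, bgr, cv::COLOR_GRAY2BGR).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Darknet or OpenCV): expand a 1-channel
// buffer into an interleaved 3-channel buffer by copying each gray value
// into the B, G and R slots of the same pixel.
std::vector<std::uint8_t> gray_to_bgr(const std::vector<std::uint8_t>& gray) {
    std::vector<std::uint8_t> bgr;
    bgr.reserve(gray.size() * 3);
    for (std::uint8_t v : gray) {
        bgr.push_back(v);  // B
        bgr.push_back(v);  // G
        bgr.push_back(v);  // R
    }
    return bgr;
}
```

The result carries three identical planes, so it triples the buffer size without adding any information, which is why it feels inelegant.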

Any help would be appreciated!

Cheers, Ralf

rniebecker commented 5 years ago

Just a follow-up question: does the color conversion mentioned above make sense?

Does image_t require RGB or BGR format to work properly?

If it requires BGR, I think the conversion is unnecessary and you actually end up feeding RGB, since COLOR_RGB2BGR swaps the R and B channels and OpenCV's default format is already BGR.

Cheers, Ralf

AlexeyAB commented 5 years ago

@rniebecker COLOR_RGB2BGR is required, because Darknet and OpenCV use different formats.

Just change this line: https://github.com/AlexeyAB/darknet/blob/b751bac17505a742f149ada81d75689b5e692cde/include/yolo_v2_class.hpp#L118 to this:

if(img_src.channels() == 3) {
    cv::cvtColor(img_src, img, cv::COLOR_RGB2BGR);
}

Also add a line to check that you are sending a cv::Mat with only 1 channel: std::cout << "\n Mat channels = " << img_src.channels() << std::endl;
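As an illustration of the channel check suggested above, here is a minimal sketch of a channel-aware conversion written against plain buffers rather than cv::Mat, so it runs without OpenCV. PlanarImage and to_planar are hypothetical names, not part of yolo_v2_class.hpp, and the planar float layout scaled to [0, 1] is an assumption about how image_t stores pixels.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for Darknet's image_t (assumed layout: planar floats
// in [0, 1], indexed as data[k*w*h + y*w + x]).
struct PlanarImage {
    int w, h, c;
    std::vector<float> data;
};

// Convert an interleaved 8-bit buffer (as stored in a cv::Mat) to planar floats.
// 3 channels: the R and B planes are swapped (BGR in, RGB out).
// 1 channel: the values are copied through with no color conversion.
PlanarImage to_planar(const std::vector<std::uint8_t>& src, int w, int h, int c) {
    if (c != 1 && c != 3) {
        throw std::invalid_argument("expected 1 or 3 channels");
    }
    if (w <= 0 || h <= 0 ||
        src.size() != static_cast<std::size_t>(w) * h * c) {
        throw std::invalid_argument("buffer size does not match w*h*c");
    }
    PlanarImage out{w, h, c, std::vector<float>(src.size())};
    const std::size_t plane = static_cast<std::size_t>(w) * h;
    for (std::size_t p = 0; p < plane; ++p) {
        for (int k = 0; k < c; ++k) {
            const int dst_k = (c == 3) ? 2 - k : k;  // swap B and R only for color
            out.data[dst_k * plane + p] = src[p * c + k] / 255.0f;
        }
    }
    return out;
}
```

One thing worth checking in the real header (an assumption, not verified against the linked commit): if img is a separate local Mat that only cvtColor fills, then skipping the conversion for 1-channel input leaves img empty, so the grayscale path would also need to assign img_src to img before it is passed on.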

rniebecker commented 5 years ago

Hey Alexey,

I've already changed that, but it just moves the crash somewhere into yolo_cpp_dll, which I can't debug on Windows with MSVC.

Cheers, Ralf

jordanlui commented 5 years ago

@rniebecker Could you provide any guidance/detail about training yolov3 with 1-channel images? When I followed a yolov3 training tutorial for 3-channel images and fed in my 1-channel images, my model seemed to fit my training data with learning_rate=0.0001, and the error decreased at each step of a 300-step training session. However, when I set channels=1 in my .cfg file, I cannot find training parameters that train the model successfully without everything going to NaN.

I've been following a learnopencv.com tutorial and everything has made sense except for how to handle these grayscale images :|

If you have any guidance I'd be very grateful!