Using darknet with RGB-D images

milagorecki commented 5 years ago

Hi,

I would like to train an object detector on RGB-D images and therefore want to extend YOLO/darknet to accept 4-channel images. So far I have been struggling to load 4-channel images and feed them into the network. The main idea is to stack the images so that the network receives 4-channel images (R, G, B, D) as input, as opposed to extracting features from RGB and D separately and fusing them later in the network.
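
To make the stacking idea concrete, here is a minimal sketch in plain C that does not depend on darknet. It assumes the colour image is available as an interleaved 8-bit RGB buffer and the depth map as a separate 16-bit buffer of the same size, and it writes a planar buffer (all R values, then all G, B and D), which is the channel-major layout darknet's image data appears to use. The function name stack_rgbd and the max_depth normalization constant are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stack an interleaved 8-bit RGB image and a 16-bit depth map into one
 * planar 4-channel float buffer: plane 0 = R, 1 = G, 2 = B, 3 = D.
 * `out` must hold 4 * w * h floats.
 * `max_depth` is a hypothetical normalization constant (raw depth units),
 * not a darknet setting; depth values above it are clamped to 1.0. */
void stack_rgbd(const uint8_t *rgb, const uint16_t *depth,
                int w, int h, float max_depth, float *out)
{
    assert(rgb && depth && out);
    assert(w > 0 && h > 0 && max_depth > 0.0f);

    size_t plane = (size_t)w * (size_t)h;  /* number of pixels per channel */

    for (size_t p = 0; p < plane; ++p) {
        /* Colour channels: scale 0..255 to 0..1. */
        for (size_t k = 0; k < 3; ++k) {
            out[k * plane + p] = rgb[p * 3 + k] / 255.0f;
        }
        /* Depth channel: scale by max_depth and clamp to 1. */
        float d = depth[p] / max_depth;
        out[3 * plane + p] = d > 1.0f ? 1.0f : d;
    }
}
```

Scaling depth into the same 0..1 range as the colour channels is a design choice of this sketch; the right depth scaling depends on the sensor and would need to be checked against the data.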

The network configuration files have an option to set the number of channels. However, I got the impression that this number does not change the input the network expects: if I set channels=4, the parsed network seems to have 4 input channels, but I can still train it with 3-channel images without any errors. What does channels actually do, and how does it relate to the expected input?
When OpenCV is enabled, load_image(filename, w, h, c) in image.c calls load_image_cv(filename, w, h, c), which can only handle images with 0, 1 or 3 channels. So I guess that's where I have to make some modifications. (Any tips for that?) However, I found that load_image is always called through load_image_color (via load_data in data.c), which passes a fixed c: load_image(filename, w, h, 3). How can I make this more dynamic, maybe by using the number of channels given in the config file?
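
As a rough illustration of what a channel-count-agnostic conversion step could look like, the sketch below turns an already decoded, interleaved 8-bit buffer with any number of channels into a planar float buffer. It is not darknet code: the function name interleaved_to_planar is hypothetical. It also assumes that the decoder delivers the first three channels in OpenCV's B, G, R order and that the depth values sit in a fourth channel (for example the alpha channel of a PNG, which limits depth to 8 bits).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert an interleaved 8-bit buffer (c values per pixel, rows `step`
 * bytes apart) into a planar float buffer scaled to 0..1.
 * `dst` must hold c * w * h floats.
 * Assumption: for c >= 3 the first three source channels are B, G, R
 * (OpenCV order) and are written out as R, G, B; any further channel,
 * such as depth, keeps its index. */
void interleaved_to_planar(const uint8_t *src, int w, int h, int c,
                           size_t step, float *dst)
{
    assert(src && dst);
    assert(w > 0 && h > 0 && c > 0);
    assert(step >= (size_t)w * (size_t)c);

    size_t plane = (size_t)w * (size_t)h;  /* number of pixels per channel */

    for (int k = 0; k < c; ++k) {
        /* Swap B and R for colour images, leave other channels in place. */
        size_t out_k = (c >= 3 && k < 3) ? (size_t)(2 - k) : (size_t)k;
        for (int i = 0; i < h; ++i) {
            for (int j = 0; j < w; ++j) {
                size_t s = (size_t)i * step + (size_t)j * (size_t)c + (size_t)k;
                size_t d = out_k * plane + (size_t)i * (size_t)w + (size_t)j;
                dst[d] = src[s] / 255.0f;
            }
        }
    }
}
```

With a routine like this, the channel count would be passed in by the caller instead of being fixed at 3. Whether that count can be taken from the config file at the point where the data is loaded is exactly the open question above.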

If useful, this is what the top of my Makefile looks like:

GPU=1
CUDNN=1
OPENCV=1
OPENMP=0
DEBUG=0

If anyone has already extended darknet to include depth, or has any tips on what I could or should try, I would be really grateful! :)

EmileTestUser commented 3 years ago

Hi @milagorecki,

I need a 4-channel YOLO implementation as well. Did you find a solution for this?

Greetings

ShivamPatel-meng commented 3 years ago

Hi @EmileTestUser,

I am also looking for an RGB-depth YOLO darknet implementation. Can anyone help me with how to edit the Makefile?

Thanks

pjreddie / darknet

Using darknet with RGB-D images #1764