jac99 / MinkLocMultimodal

MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition
MIT License
100 stars · 9 forks

Oxford Dataset RGB Image Process #23

Closed FudongGe closed 10 months ago

FudongGe commented 10 months ago

Hi, thanks for your great work. I have one question. The stereo/center camera data downloaded from the official website is single-channel, but I haven't seen any code in your implementation for handling single-channel data. Did I overlook something? Thanks in advance.

jac99 commented 10 months ago

Hi, you've probably downloaded the wrong data. Images from the stereo/center camera are 3-channel RGB images. Take a look at this screenshot showing example images from the Bumblebee XB3 stereo camera. It shows three images: left, center, and right, and all are three-channel RGB images.

FudongGe commented 10 months ago

Yes, I downloaded the data here, and its mode is L, not RGB. Is that right?

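The "L" mode is actually expected at this stage: the raw Bayer PNGs are stored as a single-channel mosaic, so Pillow reports them as greyscale. A minimal sketch reproducing this with a synthetic single-channel PNG (no RobotCar file needed):

```python
import io
import numpy as np
from PIL import Image

# Simulate a raw Bayer PNG: a single-channel 8-bit mosaic,
# which is how the RobotCar images are stored on disk.
buf = io.BytesIO()
Image.fromarray(np.zeros((8, 8), dtype=np.uint8)).save(buf, format="PNG")

img = Image.open(io.BytesIO(buf.getvalue()))
print(img.mode)  # "L" - Pillow sees one channel, not RGB
```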

jac99 commented 10 months ago

Hi, I forgot: it turns out the images are stored in a special raw format and must be converted to RGB. See the "Images" section of the documentation at https://robotcar-dataset.robots.ox.ac.uk/documentation/:

> All images are stored as lossless-compressed PNG files in unrectified 8-bit raw Bayer format. The top left 4 pixel Bayer pattern for the Bumblebee XB3 images is GBRG, and for Grasshopper2 images is RGGB. The raw Bayer images can be converted to RGB using the MATLAB demosaic function, the OpenCV cvtColor function or similar methods.

The Oxford RobotCar SDK contains functions to load and decode images into RGB format. See this GitHub repo: https://github.com/ori-mrg/robotcar-dataset-sdk The load_image function in the python/image.py file loads an image and converts it to RGB format.

My script generate_rgb_for_lidar.py, used to prepare downsampled RGB images for MinkLocMultimodal training, takes care of this: it uses the load_image function from the RobotCar SDK to read and convert the images. So you don't need to convert the images yourself. Just run the generate_rgb_for_lidar.py script and it will properly open and convert the images and generate downsampled RGB images for MinkLocMultimodal training.

jac99 commented 10 months ago

I've also fixed the link in the README to the file with preprocessed and downsampled images from the Oxford RobotCar dataset. You can use these images to train and evaluate the MinkLocMultimodal method.

https://drive.google.com/file/d/1g6nWvJ-T-M41MyZTa0oRHIwP7G-cxer6/view?usp=sharing

FudongGe commented 10 months ago

Okay, I understand now. Thank you very much for your help!