RDFNet:RGB-D Multi-level Residual Feature Fusion for Indoor Semantic Segmentation
This is the implementation of the models and test code for the "RDFNet:RGB-D Multi-level Residual Feature Fusion for Indoor Semantic Segmentation", ICCV2017.
File description
- caffe-master: caffe used in our experiments
- test.py: demo code
- Each of NYU-50 / NYU-101 / NYU-152 directory includes RDF model and its prototxt corresponding to different number of resnet layers. (*You may need to change the 'nyud_dir' parameter in the prototxt.)
- data: test data
- nyud_layers.py: input python layer
- gupta-utils-HHA: HHA generation utils by Gupta et al. [2]
Usage
- Install Opencv
- Compile pycaffe: modify the "Makefile.config" in caffe-master for your environment.
- Download the model files.
- Run test.py
- Change 'caffe_root'
- Set the 'scale' and 'model' to test.
- To achieve the same accuracy reported in our paper, you need to implement multi-scale (0.6~1.2) ensemble as described in the paper.
Environment
Our experiments were mainly performed on Ubuntu 14.04 with CUDA7.0 / CUDNNv4 / Titan X (maxwell) / Opencv2.7
Note
- Similarly to RefineNet,
- Our implementation uses bicubic resize function to resize feature map.
- We remove white boundaries of the images in NYUDv2.
- Any comment for improvement is welcome as the code is not fully optimized. but please note that further maintenance will be infrequently performed.
- OOM may occur for RDF-152 with the image scale larger than 1.0 on different environtment (e.g., Titan Xp, CUDA 8.0, CUDNN v6)
Citation
- We would like to thank Guosheng Lin [3] for invaluable help.
[1] @InProceedings{Park_2017_ICCV,
author = {Park, Seong-Jin and Hong, Ki-Sang and Lee, Seungyong},
title = {RDFNet: RGB-D Multi-Level Residual Feature Fusion for Indoor Semantic Segmentation},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}
}
[2] @incollection{guptaECCV14,
author = {Saurabh Gupta and Ross Girshick and Pablo Arbelaez and Jitendra Malik},
title = {Learning Rich Features from {RGB-D} Images for Object Detection and Segmentation},
booktitle = ECCV,
year = {2014},
}
[3] @inproceedings{lin2017refinenet,
title={Refinenet: Multi-path refinement networks for high-resolution semantic segmentation},
author={Lin, Guosheng and Milan, Anton and Shen, Chunhua and Reid, Ian},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2017}
}
License
For academic usage, the code is released under the permissive BSD license. For any commercial purpose, please contact the authors.