MatConvNet-Calvin v1.0
MatConvNet-Calvin is a wrapper around MatConvNet that (re-)implements several state-of-the-art methods in object detection and semantic segmentation. This includes our own work "Region-based semantic segmentation with end-to-end training" [5]. Calvin is a Computer Vision research group at the University of Edinburgh (http://calvin.inf.ed.ac.uk/). Copyright by Holger Caesar and Jasper Uijlings, 2015-2016.
Overview
Methods
- Fast R-CNN (FRCN) [1]: State-of-the-art object detection method. The original code was implemented for Caffe. This reimplementation ports it to MatConvNet by adding region of interest pooling and a simplified version of bounding box regression.
- Fully Convolutional Networks (FCN) [2]: Very successful semantic segmentation method that forms the basis for many modern semantic segmentation methods. FCNs operate directly on image pixels, applying a series of convolutional, fully connected and deconvolutional layers. This implementation is based on MatConvNet-FCN and is modified to work with arbitrary datasets.
- Multi-Class Multiple Instance Learning [3]: Extends FCNs for weakly supervised semantic segmentation. We also implement the improved loss function of "What's the point" [4], which takes into account label presence and absence.
- Region-based semantic segmentation with end-to-end training (E2S2) [5]: State-of-the-art semantic segmentation method that brings together the advantages of region-based methods and end-to-end trainable FCNs. This implementation is based on our implementation of Fast R-CNN and adds the free-form region of interest pooling and region-to-pixel layers.
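To give a feel for the region of interest pooling mentioned above, here is a minimal sketch in plain Matlab. It is not the repository's implementation: the toy feature map, the box format and the way the box is split into bins are all assumptions made for illustration. The sketch max-pools one rectangular region of a single-channel feature map into a fixed 2x2 output.

```matlab
% Illustrative only: max-pool one box of a single-channel feature map
% into a fixed-size grid. Not the code used in MatConvNet-Calvin.
featureMap = reshape(1:36, 6, 6)';   % toy 6x6 map, values 1..36 in row-major order
box = [2, 1, 5, 4];                  % assumed format: [rowStart, colStart, rowEnd, colEnd], inclusive
outSize = 2;                         % pooled output is outSize x outSize

% Split the box into outSize bins along each axis (assumed splitting rule)
rowEdges = round(linspace(box(1), box(3) + 1, outSize + 1));
colEdges = round(linspace(box(2), box(4) + 1, outSize + 1));

pooled = zeros(outSize);
for i = 1:outSize
    for j = 1:outSize
        bin = featureMap(rowEdges(i):rowEdges(i + 1) - 1, ...
                         colEdges(j):colEdges(j + 1) - 1);
        pooled(i, j) = max(bin(:));  % keep the strongest activation per bin
    end
end
disp(pooled);
```

Whatever the size of the box, the output always has the same shape, which is what allows regions of different sizes to be fed into the same fully connected layers.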
Dependencies
Installation
- Install Matlab R2015a (or newer) and Git
- Clone the repository and its submodules from your shell
git clone https://github.com/nightrome/matconvnet-calvin.git
cd matconvnet-calvin
git submodule update --init
- Execute the following Matlab commands
- Setup MatConvNet
cd matconvnet/matlab; vl_compilenn('EnableGpu', true); cd ../..;
- Setup MatConvNet-Calvin
cd matconvnet-calvin/matlab; vl_compilenn_calvin(); cd ../..;
- Add files to Matlab path
setup();
- (Optional) Download pretrained models:
- FRCN:
downloadModel('frcn');
- FCN:
downloadModel('fcn');
- E2S2 (Full):
downloadModel('e2s2_full');
- E2S2 (Fast):
downloadModel('e2s2_fast');
Instructions
1) FRCN
- Usage: Run
demo_frcn()
- What: This script trains and tests Fast R-CNN using VGG-16 for object detection on PASCAL VOC 2010. The parametrization of the regressed bounding boxes is slightly simplified, but we found this to make no difference in performance.
- Model: Training this model takes about 8h on a Titan X GPU. If you just want to use it, you can download the pretrained model as described in the installation steps above and then run the demo to see the test results.
- Results: If the program executes correctly, it will print the average precision for each of the 20 classes in PASCAL VOC, followed by their mean (mAP). The example model achieves 63.5% mAP on the validation set using no external training data.
- Note: The results vary due to the random order of images presented during training. To reproduce the above results we fix the initial seed of the random number generator.
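As a rough illustration of the two points above, the following sketch shows how fixing the seed makes a random image order repeatable and how mAP is obtained from per-class average precisions. The seed value, the number of images and the AP values are made up for the example; how demo_frcn() sets its seed internally is not shown here.

```matlab
% Illustrative only: seed value and AP numbers are made up.
rng(42);                          % fix the seed of Matlab's random number generator
imageOrder = randperm(10);        % hypothetical shuffled order of 10 training images

% mAP is the unweighted mean of the per-class average precisions
classAp = [0.75, 0.50, 0.25];     % made-up AP values for three classes
mAP = mean(classAp);
fprintf('mAP: %.1f%%\n', 100 * mAP);
```

Calling rng(42) again before randperm(10) yields the same image order, which is why runs with a fixed seed give matching results.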
2) FCN
3) E2S2
Training for different datasets
- The FCN and E2S2 models can easily be trained on different datasets.
- Create your own dataset class "MyDataset", e.g. by copying from SiftFlowDatasetMC. The labelCount field should cover all classes (incl. background if it exists in your dataset). Note that the class has to inherit from DatasetMC, which has all the relevant methods: getImage(), getImLabelMap(), etc.
- FCN: Change the dataset in demo_fcn.m.
- E2S2: Modify setupE2S2Regions.m and e2s2_wrapper_SiftFlow_fast.m.
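A possible starting point for the dataset class is sketched below. This is a hypothetical skeleton, not code from the repository: it assumes that DatasetMC can be constructed without arguments and that labelCount can be set in the constructor, and the label count of 21 is an illustrative value. In practice, copying SiftFlowDatasetMC as suggested above is the safer route, since it shows which other fields need to be set.

```matlab
% MyDataset.m: hypothetical skeleton, not taken from the repository.
classdef MyDataset < DatasetMC
    methods
        function obj = MyDataset()
            % Assumption: the DatasetMC constructor takes no arguments
            obj@DatasetMC();

            % Count every class, including background if the dataset has one
            obj.labelCount = 21;  % illustrative value
        end
    end
end
```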
References
- [1] Fast R-CNN (FRCN) by Girshick et al., ICCV 2015, http://arxiv.org/abs/1504.08083
- [2] Fully Convolutional Networks for Semantic Segmentation (FCN) by Long et al., CVPR 2015, http://arxiv.org/abs/1411.4038
- [3] Fully Convolutional Multi-Class Multiple Instance Learning by Pathak et al., ICLR 2015 workshop, http://arxiv.org/abs/1412.7144
- [4] What's the point: Semantic segmentation with point supervision by Bearman et al., ECCV 2016, http://arxiv.org/abs/1506.02106
- [5] Region-based semantic segmentation with end-to-end training (E2S2) by Caesar et al., ECCV 2016, http://arxiv.org/abs/1607.07671
Disclaimer
Except for [5], none of the methods implemented in MatConvNet-Calvin is authorized by the original authors. These are (possibly simplified) reimplementations of parts of the described methods and they might vary in terms of performance. This software is covered by the FreeBSD License. See LICENSE.MD for more details.
Contact
If you run into any problems with this code, please submit a bug report on the GitHub page of the project. For other inquiries contact holger-at-it-caesar.com.