Research computer vision libraries

lewfish commented 6 years ago

We'd like to create a new backend for one of the tasks using a user-friendly deep learning computer vision library. Research https://gluon-cv.mxnet.io/ and https://github.com/chainer/chainercv. To see whether this is generally a good idea, and to help decide which library to use, read the documentation for both libraries and take some notes (bullet points are good). Pay special attention to:

installation process
community around the library (issues, PRs, etc.)
quality of documentation
how to create a custom dataset
how to configure a model/training
how to train a model
how to make predictions
general impression of how flexible and elegant the API is
the variety of models that are supported

lewfish commented 6 years ago

ability to train on CPU and GPU
pretrained models that are available
maturity of the library

lewfish commented 6 years ago

the ability to resume training from a checkpoint
generating log output during training, to TensorBoard or similar

nholeman commented 6 years ago

My choice: GluonCV. Here's why:

Comparison

Criterion	GluonCV	ChainerCV
Installation	MXNet	numpy, chainer, pillow, cython; for extra functionality: ChainerMN, matplotlib, OpenCV, and SciPy
Community	Current version: 0.3.0; last update: < 1 week ago; popularity: 164 forks, 896 stars, 125k downloads; contributors: 26; license: Apache-2.0; active discussion forum	Current version: 0.10.0; last update: < 1 week ago; popularity: 179 forks, 832 stars; contributors: 19; license: MIT
Documentation	Large; better organized than ChainerCV's	Large; greater in volume than GluonCV's
Custom Dataset	Following MXNet convention, LST files (plain-text list files that store labels) are recommended. Also supports deriving from the PASCAL VOC format	ChainerCV experimental: GetterDataset
Config Model/Training	Straightforward	Straightforward
Training	Training is done with gluon.utils	Training is done directly with Chainer
Make Prediction	Easy	Easy
API Flexibility	Supports image classification, object detection, semantic segmentation, and instance segmentation	Supports image classification, object detection, semantic segmentation, and instance segmentation. ChainerCV was implemented in an academic setting for a competition, so it seems less flexible overall than GluonCV, which is more of a developed product.
Model Variety	CIFAR: 10 models; ImageNet: 5 models; SSD: 11 models; Faster R-CNN: 3 models; YOLOv3: 3 models; Mask R-CNN: 1 model; FCN: 5 models; PSPNet: 4 models; DeepLabV3: 4 models	Classification: VGG16, ResNet50, ResNet101, ResNet152; Detection: Faster R-CNN, SSD300, SSD512, YOLOv2, YOLOv3; Semantic segmentation: SegNet, PSPNet
CPU/GPU	Yes	Yes
Library Maturity	Project began Feb 2018	Began 2018
Stop/Resume training	Yes	No
Log output	Yes	No
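
To make the "Custom Dataset" row more concrete, here is a minimal sketch of the basic MXNet-style LST layout for image classification, where each line holds an integer index, a label, and an image path separated by tabs. It uses only the Python standard library; the helper names (`format_lst_line`, `write_lst`, `read_lst`) and the sample tile paths are hypothetical and are not part of GluonCV or MXNet. The object detection variant carries extra header and bounding-box fields and is not shown here.

```python
import io


def format_lst_line(index, label, path):
    """Build one LST line: index, label, and image path, tab-separated."""
    return f"{index}\t{label}\t{path}"


def write_lst(samples, fileobj):
    """Write (label, relative_path) pairs to fileobj, one LST line each."""
    for index, (label, path) in enumerate(samples):
        fileobj.write(format_lst_line(index, label, path) + "\n")


def read_lst(fileobj):
    """Parse LST lines back into (index, label, path) tuples."""
    records = []
    for line in fileobj:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        # First field is the index, last is the path, second is the label.
        records.append((int(fields[0]), float(fields[1]), fields[-1]))
    return records


# Hypothetical sample data: class ids paired with image chip paths.
samples = [(0, "tiles/building_001.png"), (1, "tiles/road_007.png")]

buffer = io.StringIO()
write_lst(samples, buffer)
lst_text = buffer.getvalue()
records = read_lst(io.StringIO(lst_text))
```

Writing to an in-memory buffer keeps the sketch free of filesystem side effects; in practice the same functions would be pointed at a file opened with `open("train.lst", "w")`.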

Other Thoughts

GluonCV: GluonCV's documentation is better organized and more production-ready than ChainerCV's.

ChainerCV: Some useful modules in ChainerCV (such as the one for creating your own dataset) are works in progress intended for Chainer and will be removed from ChainerCV once they are fully developed. If Raster Vision uses any of these modules, it will have to be refactored once they move into Chainer.

Conclusion

Overall, ChainerCV was easier for me to get up and running. However, as I used these tools more, I became more convinced that GluonCV is better equipped to be used in a flexible plugin for Raster Vision. GluonCV has better-organized documentation, more avenues for discussion (forum and GitHub issues), and features that won't be deprecated later (unlike the ChainerCV features that are really WIP Chainer modules).

lewfish commented 6 years ago

GluonCV it is!

azavea / raster-vision

Research computer vision libraries #471