robmarkcole / fire-detection-from-images

Detect fire in images using neural nets
MIT License

Investigate anchor boxes #13

Open robmarkcole opened 3 years ago

robmarkcole commented 3 years ago

Raised by Farid on Discord: try changing the anchor box sizes to test whether it can improve model accuracy.

Some takeaways from "Improving Anchor Box Configuration": as a general rule, you should ask yourself the following questions about your dataset before diving into training your model:

Code

import torchvision
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator

# load a pre-trained model for classification and return
# only the features
backbone = torchvision.models.mobilenet_v2(pretrained=True).features
# FasterRCNN needs to know the number of
# output channels in a backbone. For mobilenet_v2, it's 1280
# so we need to add it here
backbone.out_channels = 1280

# let's make the RPN generate 5 x 3 anchors per spatial
# location, with 5 different sizes and 3 different aspect
# ratios. We have a Tuple[Tuple[int]] because each feature
# map could potentially have different sizes and
# aspect ratios
anchor_generator = AnchorGenerator(sizes=((32, 64, 128, 256, 512),),
                                   aspect_ratios=((0.5, 1.0, 2.0),))

# let's define what are the feature maps that we will
# use to perform the region of interest cropping, as well as
# the size of the crop after rescaling.
# if your backbone returns a Tensor, featmap_names is expected to
# be ['0'] (a string key in recent torchvision releases). More
# generally, the backbone should return an OrderedDict[Tensor],
# and in featmap_names you can choose which feature maps to use.
roi_pooler = torchvision.ops.MultiScaleRoIAlign(featmap_names=['0'],
                                                output_size=7,
                                                sampling_ratio=2)

# put the pieces together inside a FasterRCNN model
model = FasterRCNN(backbone,
                   num_classes=2,
                   rpn_anchor_generator=anchor_generator,
                   box_roi_pool=roi_pooler)
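
To get a feel for what the 5 x 3 configuration above actually produces, here is a rough pure-Python sketch that enumerates the anchor shapes. It assumes the aspect ratio is interpreted as height / width and that each anchor keeps an area of roughly size squared, which is my understanding of how torchvision's AnchorGenerator builds its base anchors. The helper name `anchor_shapes` is made up for illustration, and torchvision's own rounding may differ by a pixel.

```python
import math


def anchor_shapes(sizes, aspect_ratios):
    """Return a (width, height) pair for every aspect-ratio/size combination.

    Assumption: ratio = height / width, and width * height stays close to
    size ** 2, so the height is scaled by sqrt(ratio) and the width by
    1 / sqrt(ratio).
    """
    shapes = []
    for ratio in aspect_ratios:
        h_ratio = math.sqrt(ratio)
        w_ratio = 1.0 / h_ratio
        for size in sizes:
            shapes.append((round(w_ratio * size), round(h_ratio * size)))
    return shapes


sizes = (32, 64, 128, 256, 512)
aspect_ratios = (0.5, 1.0, 2.0)
shapes = anchor_shapes(sizes, aspect_ratios)

print(f"{len(shapes)} anchors per spatial location")
for width, height in shapes:
    print(f"{width:4d} x {height:4d}")
```

If most fire regions in the dataset are much smaller or much more elongated than every shape printed here, that is a hint the sizes or aspect ratios are worth changing.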

You can pass anchor_generator to IceVision models: extra arguments are forwarded to torchvision as keyword arguments (kwargs). What you need is code similar to the part below. Choose sizes and aspect_ratios that make sense for your use case:

# let's make the RPN generate 5 x 3 anchors per spatial
# location, with 5 different sizes and 3 different aspect
# ratios. We have a Tuple[Tuple[int]] because each feature
# map could potentially have different sizes and
# aspect ratios
anchor_generator = AnchorGenerator(sizes=((32, 64, 128, 256, 512),),
                                   aspect_ratios=((0.5, 1.0, 2.0),))

and pass anchor_generator to the IceVision faster_rcnn model.
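
One possible way to pick sizes and aspect_ratios that fit the use case is to look at the ground-truth boxes first. The sketch below is only an illustration: the `summarize_boxes` helper and the example boxes are hypothetical, and it assumes annotations are available as (xmin, ymin, xmax, ymax) pixel tuples. It reports the range of box scales (square root of area) and height / width ratios, which can then be compared against the anchor sizes and ratios being considered.

```python
import math
import statistics


def summarize_boxes(boxes):
    """Summarize scale and aspect ratio of (xmin, ymin, xmax, ymax) boxes.

    Scale is sqrt(width * height); ratio is height / width, matching the
    convention assumed for the anchor aspect ratios.
    """
    scales, ratios = [], []
    for xmin, ymin, xmax, ymax in boxes:
        width, height = xmax - xmin, ymax - ymin
        if width <= 0 or height <= 0:
            continue  # skip degenerate annotations
        scales.append(math.sqrt(width * height))
        ratios.append(height / width)
    return {
        "min_scale": min(scales),
        "median_scale": statistics.median(scales),
        "max_scale": max(scales),
        "min_ratio": min(ratios),
        "median_ratio": statistics.median(ratios),
        "max_ratio": max(ratios),
    }


# Hypothetical example boxes, not taken from the fire dataset
example_boxes = [
    (10, 10, 50, 90),
    (100, 120, 130, 150),
    (0, 0, 200, 100),
    (30, 40, 94, 104),
]
summary = summarize_boxes(example_boxes)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Note that the detector may resize images before the RPN sees them, so the statistics should be computed at the resolution the model actually trains on.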

NOTE: anchor_generator cannot be used with efficientdet.

robmarkcole commented 3 years ago

Attempt 1, am guessing and getting errors

https://github.com/robmarkcole/fire-detection-from-images/blob/master/pytorch/icevision/icevision_firenet_faster_rcnn_anchors.ipynb