Open cooperlab opened 3 years ago
This can be achieved using code similar to _roialign_coords from the roialign subpackage.
I suggest this be added to the anchor subpackage as an 'upsample' function. For a given 3D tensor containing feature maps and an upsampling factor (e.g. 2), first generate coordinates between anchor points, then use tfa interpolation to generate new feature maps at these locations.
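A rough sketch of what such an 'upsample' function might do, written in plain Python so the arithmetic is easy to follow. This is not the tfa call itself: the names upsample_coords, bilinear and upsample are hypothetical, the feature map is a single-channel 2D list rather than a 3D tensor, and the half-offset sampling convention and edge clamping are assumptions chosen to match the edge behaviour described in this issue.

```python
import math


def upsample_coords(n, factor):
    """Sample positions along one axis, in feature-map index units.

    Assumes each original cell is split into `factor` equal sub-cells and
    that samples sit at sub-cell centers (hypothetical convention).
    """
    return [(j + 0.5) / factor - 0.5 for j in range(n * factor)]


def bilinear(fmap, y, x):
    """Bilinearly interpolate a 2D list at (y, x), clamping to the grid."""
    h, w = len(fmap), len(fmap[0])
    # Clamp so that extrapolated points repeat the edge values.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0 = min(int(math.floor(y)), max(h - 2, 0))
    x0 = min(int(math.floor(x)), max(w - 2, 0))
    y1 = min(y0 + 1, h - 1)
    x1 = min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def upsample(fmap, factor):
    """Return a feature map that is `factor` times denser in both axes."""
    ys = upsample_coords(len(fmap), factor)
    xs = upsample_coords(len(fmap[0]), factor)
    return [[bilinear(fmap, y, x) for x in xs] for y in ys]


fmap = [[0.0, 2.0], [4.0, 6.0]]
up = upsample(fmap, 2)  # 2 x 2 map becomes 4 x 4
```

In a real implementation the same coordinates would be passed to the tfa interpolation routine for every channel at once; the loop above only illustrates where the new sample points fall.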
The math where receptive field size is used needs to be examined carefully when doing this to make sure we are not violating assumptions. Currently we assume that the field size and anchor spacing are the same. We might need to add an extra argument or member to the model to distinguish between these two.
Need to think about how to handle anchors at the edges. Suppose you have a 2 x 2 field size, so your top left anchor is centered at position (1, 1) (zero indexed). If you upsample 2X then you create 4 new anchors from this original anchor, with centers (0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5). Since the anchors cannot cross the image boundary, you have to discard the first three anchors. These are technically extrapolated anyway, and so tfa interpolate will just repeat the original feature map values here.
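The edge case above can be checked with a few lines of plain Python. This is only a sketch under assumptions: the helper names are hypothetical, the image is taken to span [0, image_size] along each axis, and the anchor is square with side anchor_size.

```python
def upsampled_centers(n_fields, field, factor):
    """Anchor centers along one axis after upsampling by `factor`."""
    spacing = field / factor  # distance between neighbouring centers
    return [spacing * (j + 0.5) for j in range(n_fields * factor)]


def inside(center, anchor_size, image_size):
    """True if an anchor of this size stays within [0, image_size]."""
    half = anchor_size / 2
    return center - half >= 0 and center + half <= image_size


def valid_anchor_centers(image_size, field, factor, anchor_size):
    """(x, y) centers of anchors that do not cross the image boundary."""
    n_fields = image_size // field
    cs = upsampled_centers(n_fields, field, factor)
    return [
        (x, y)
        for y in cs
        for x in cs
        if inside(x, anchor_size, image_size)
        and inside(y, anchor_size, image_size)
    ]


# 4 x 4 image, 2 x 2 field, 2X upsampling, anchor size equal to the field.
kept = valid_anchor_centers(image_size=4, field=2, factor=2, anchor_size=2)
```

With these numbers, of the four centers derived from the top left anchor only (1.5, 1.5) survives, which agrees with discarding the first three.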
Upsampling the feature map tensor effectively changes the anchor spacing, which is a function of the receptive field size (the ‘field’ variable is used throughout the code). ‘field’ could represent either the actual field size or the anchor spacing, so we have to treat this carefully.
The field member appears in several functions:
/models/faster_rcnn.py -> map_outputs: ‘field’ is used to calculate the locations of each anchor in the RPN outputs (classification, regression), moving from the feature map where each index represents a field back to image space where each index is a pixel. Here ‘field’ is used both as field size and spacing.
/anchors/create.py -> _anchor_corners: ‘field’ here is used to represent field size and not just spacing. For a given anchor size this function determines which anchors are entirely contained within the image, and which receptive fields they correspond to. Do not call when ‘field’ represents spacing.
/anchors/create.py -> _anchors_at_size: Calls _anchor_corners. Do not call when ‘field’ represents anchor spacing.
/anchors/create.py -> create_anchors: Calls _anchors_at_size. Do not call when ‘field’ represents anchor spacing.
Two ways to approach this problem:
Create anchor position variables as currently implemented using the receptive field size (do not modify the anchor generating functions). Generate the initial feature map, interpolate it, and then modify the anchor variables and the ‘field’ class member based on the upsampling. Modify map_outputs to accommodate a separate variable that represents anchor spacing.
Introduce a new member ‘spacing’ to FasterRCNN to handle the current different interpretations of ‘field’, using ‘spacing’ where we mean anchor spacing, and using ‘field’ where we mean field size. Generate the feature maps first and then interpolate them. Generate the anchors in a following step instead of modifying them. Modify the anchor generating functions as well as map_outputs.
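To make the second option concrete, here is a minimal sketch of keeping field size and anchor spacing as separate values. AnchorGrid, from_upsampling and center are hypothetical names used only for illustration, not existing members of FasterRCNN, and the center formula assumes anchors sit at the middle of each spacing interval.

```python
from dataclasses import dataclass


@dataclass
class AnchorGrid:
    """Hypothetical container separating the two meanings of 'field'."""

    field: float    # receptive field size in pixels
    spacing: float  # distance between adjacent anchor centers in pixels

    @classmethod
    def from_upsampling(cls, field, factor):
        # Upsampling the feature map by `factor` shrinks the spacing
        # while the receptive field size stays the same.
        return cls(field=field, spacing=field / factor)

    def center(self, index):
        """Image-space coordinate of the anchor at a feature-map index."""
        return self.spacing * (index + 0.5)


current = AnchorGrid(field=32, spacing=32)     # today: spacing == field
dense = AnchorGrid.from_upsampling(32, 2)      # after 2X upsampling
```

Code that maps feature-map indices back to image space (as map_outputs does) would read spacing, while code that reasons about what each feature vector sees would keep reading field.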
I have spent the last two days looking at the code of roialign and create_anchors and I had some interesting discoveries.
As far as I understand, each feature vector created by the backbone net is one anchor. This means that create_anchors should create the same number of anchors as feature vectors. This does not hold true (for an input image 224 x 224 x 3):
roialign:
Problems:
-field/2 is incorrect and should be +field/2. This would mean the first and last anchors are missing, which could make sense because those anchors might have crossed the image border (the first anchor should be located at (8, 8) and has a width of 32, so it would have been filtered out if generated).
I think this math is correct for the current case where field == feature map spacing.
We need to add roi_align to the list of functions above that need to be modified to incorporate spacing != field.
I implemented a resizing function to increase object size in #41. I added this to the lymphoma Jupyter example and there was a significant performance increase.
Can be addressed by applying upsample layer to backbone feature map output tensor.
Add functionality to increase the density of anchors by interpolating feature maps generated by backbone.