Implement fast-nms

fsx950223 commented 4 years ago

Describe the feature and the current behavior/state.

The fast-nms algorithm runs faster than traditional NMS, as described in the paper linked below. A kernel implementation may provide better performance than a Python implementation.

Relevant information

Are you willing to contribute it (yes/no): yes
Are you willing to maintain it going forward? (yes/no): yes
Is there a relevant academic paper? (if so, where): https://arxiv.org/pdf/1904.02689.pdf
Is there already an implementation in another framework? (if so, where): https://github.com/dbolya/yolact/blob/1722387d75210361c1f21c911d0e2420a48c7a23/layers/functions/detection.py#L136
Was it part of tf.contrib? (if so, where): no

Which API type would this fall under (layer, metric, optimizer, etc.)

Who will benefit with this feature?

Any other info. Tensorflow Implementation on colab

WindQAQ commented 4 years ago

Would like to see how fast it is compared with tf.image.non_max_suppression.

SSaishruthi commented 4 years ago

Is anyone working on this? If not, I will take a look @seanpmorgan

seanpmorgan commented 4 years ago

> Is anyone working on this? If not, I will take a look @seanpmorgan

Not that I'm aware of, but @fsx950223 said they are willing to implement it. If there has been no progress, it should be fine to start on it.

fsx950223 commented 4 years ago

I'm a little busy for now, I will try it next week. Thanks.

SSaishruthi commented 4 years ago

I will wait for the next update on this.

samikama commented 4 years ago

@fsx950223 how does this algorithm compare against existing GPU NMS implementations in TF? Do you have any idea?

fsx950223 commented 4 years ago

> @fsx950223 how does this algorithm compare against existing GPU NMS implementations in TF? Do you have any idea?

A TensorFlow benchmark could compare the performance of the two.
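
For a rough comparison, a small timing helper along these lines could be used. This is only a sketch built on the Python standard library, not TensorFlow's own benchmark tooling; `time_fn`, `warmup`, and `repeats` are hypothetical names, and the sorted-list workload is a stand-in so the snippet runs without TensorFlow.

```python
import statistics
import time


def time_fn(fn, warmup=3, repeats=20):
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up runs absorb one-off costs such as tracing or caching
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


# Stand-in workload (an assumption for illustration, not an NMS call).
elapsed_ms = time_fn(lambda: sorted(range(10_000), reverse=True))
print(f"median: {elapsed_ms:.3f} ms")
```

To compare the two NMS variants, `fn` could wrap each call, for example `lambda: tf.image.non_max_suppression(boxes, scores, max_output_size=15, iou_threshold=0.5).numpy()`. Ending with `.numpy()` forces the result to be computed and copied to the host, which should keep asynchronous GPU execution from skewing the timings.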

bhack commented 4 years ago

We have an upstream fast NMS kernel in TensorFlow Lite (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/kernels/detection_postprocess.cc#L456-L463). @seanpmorgan I think this could be more in the "perimeter" of keras-cv (/cc @tanzhenyu), since yesterday we closed the other image-processing-related issues/PRs.

anshkumar commented 2 years ago

@bhack I'm not sure how the fast NMS kernel in TensorFlow Lite would work in a normal scenario. Anyway, it's not hard to implement fast-nms:

import tensorflow as tf


def _area(boxlist, scope=None):
    # https://github.com/tensorflow/models/blob/831281cedfc8a4a0ad7c0c37173
    # 963fafb99da37/official/vision/detection/utils/object_detection/
    # box_list_ops.py#L48

    """Computes area of boxes.
    Args:
        boxlist: float tensor of shape [N, 4] holding N boxes as
            [y_min, x_min, y_max, x_max]
        scope: name scope (unused here).
    Returns:
        a tensor with shape [N] representing box areas.
    """
    y_min, x_min, y_max, x_max = tf.split(
        value=boxlist, num_or_size_splits=4, axis=1)
    return tf.squeeze((y_max - y_min) * (x_max - x_min), [1])

def _intersection(boxlist1, boxlist2, scope=None):
    # https://github.com/tensorflow/models/blob/831281cedfc8a4a0ad7c0c37173
    # 963fafb99da37/official/vision/detection/utils/object_detection/
    # box_list_ops.py#L209

    """Compute pairwise intersection areas between boxes.
    Args:
        boxlist1: float tensor of shape [N, 4] holding N boxes
        boxlist2: float tensor of shape [M, 4] holding M boxes
        scope: name scope (unused here).
    Returns:
        a tensor with shape [N, M] representing pairwise intersections
    """
    y_min1, x_min1, y_max1, x_max1 = tf.split(
        value=boxlist1, num_or_size_splits=4, axis=1)
    y_min2, x_min2, y_max2, x_max2 = tf.split(
        value=boxlist2, num_or_size_splits=4, axis=1)
    all_pairs_min_ymax = tf.minimum(y_max1, tf.transpose(y_max2))
    all_pairs_max_ymin = tf.maximum(y_min1, tf.transpose(y_min2))
    intersect_heights = tf.maximum(0.0, 
        all_pairs_min_ymax - all_pairs_max_ymin)
    all_pairs_min_xmax = tf.minimum(x_max1, tf.transpose(x_max2))
    all_pairs_max_xmin = tf.maximum(x_min1, tf.transpose(x_min2))
    intersect_widths = tf.maximum(0.0, 
        all_pairs_min_xmax - all_pairs_max_xmin)
    return intersect_heights * intersect_widths

def _iou(boxlist1, boxlist2, scope=None):
    # https://github.com/tensorflow/models/blob/831281cedfc8a4a0ad7c0c37173
    # 963fafb99da37/official/vision/detection/utils/object_detection/
    # box_list_ops.py#L259

    """Computes pairwise intersection-over-union between box collections.
    Args:
        boxlist1: float tensor of shape [N, 4] holding N boxes
        boxlist2: float tensor of shape [M, 4] holding M boxes
        scope: name scope (unused here).
    Returns:
        a tensor with shape [N, M] representing pairwise iou scores.
    """
    intersections = _intersection(boxlist1, boxlist2)
    areas1 = _area(boxlist1)
    areas2 = _area(boxlist2)
    unions = (tf.expand_dims(areas1, 1) + tf.expand_dims(
        areas2, 0) - intersections)
    return tf.where(
        tf.equal(intersections, 0.0),
        tf.zeros_like(intersections), tf.truediv(intersections, unions))

def _cc_fast_nms(boxes, masks, scores, iou_threshold: float = 0.5, top_k: int = 15):
    # Cross-class NMS: collapse all the classes into one.
    # boxes: [N, 4], masks: [N, ...], scores: [N, num_classes].
    classes = tf.argmax(scores, axis=-1) + 1
    scores = tf.reduce_max(scores, axis=-1)
    # Indices of the top_k highest-scoring boxes, in descending score order.
    _, idx = tf.math.top_k(scores, k=tf.math.minimum(top_k, tf.shape(scores)[0]))
    boxes_idx = tf.gather(boxes, idx, axis=0)

    # Compute the pairwise IoU between the boxes
    iou = _iou(boxes_idx, boxes_idx)

    # Zero out the lower triangle and the diagonal of the IoU matrix
    iou = tf.linalg.band_part(iou, 0, -1) - tf.linalg.band_part(iou, 0, 0)

    # Now that everything in the diagonal and below is zeroed out, if we take the max
    # of the IoU matrix along the columns, each column will represent the maximum IoU
    # between this element and every element with a higher score than this element.
    iou_max = tf.reduce_max(iou, axis=0)

    # Now just filter out the ones greater than the threshold, i.e., only keep boxes that
    # don't have a higher scoring box that would suppress it in normal NMS.
    idx_out = tf.boolean_mask(idx, iou_max <= iou_threshold)

    classes = tf.cast(tf.gather(classes, idx_out), dtype=tf.float32)
    boxes = tf.gather(boxes, idx_out)
    masks = tf.gather(masks, idx_out)
    scores = tf.gather(scores, idx_out)

    return boxes, masks, classes, scores
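
One consequence of taking the column-wise max is worth spelling out: in fast-nms a box that is itself suppressed can still suppress lower-scoring boxes, which traditional NMS does not allow. The following framework-free sketch illustrates this on three made-up boxes. The helper names (`iou`, `traditional_nms`, `fast_nms`) and the box values are assumptions for illustration only, and the Python loops stand in for the matrix operations used above.

```python
def iou(a, b):
    """IoU of two boxes given as [y_min, x_min, y_max, x_max]."""
    h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = h * w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if inter > 0.0 else 0.0


def traditional_nms(boxes, threshold):
    """Greedy NMS: only boxes that were kept may suppress later boxes."""
    keep = []
    for j, box in enumerate(boxes):
        if all(iou(boxes[i], box) <= threshold for i in keep):
            keep.append(j)
    return keep


def fast_nms(boxes, threshold):
    """Fast NMS: any higher-scoring box may suppress, even a suppressed one."""
    keep = []
    for j, box in enumerate(boxes):
        # Max over column j of the strictly upper-triangular IoU matrix.
        col_max = max((iou(boxes[i], box) for i in range(j)), default=0.0)
        if col_max <= threshold:
            keep.append(j)
    return keep


# Hypothetical boxes, already sorted by descending score.
boxes = [
    [0.0, 0.0, 1.0, 10.0],   # A, highest score
    [0.0, 2.5, 1.0, 12.5],   # B, IoU(A, B) = 0.6
    [0.0, 5.0, 1.0, 15.0],   # C, IoU(B, C) = 0.6, IoU(A, C) = 1/3
]

print(traditional_nms(boxes, 0.5))  # [0, 2]
print(fast_nms(boxes, 0.5))         # [0]
```

Traditional NMS keeps C because B, the only box that overlaps it strongly, has already been removed by A. Fast-nms drops C because B still counts as a higher-scoring overlapping box, so it can remove slightly more boxes than traditional NMS.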

tensorflow / addons

Implement fast-nms #671