fizyr / keras-retinanet

Keras implementation of RetinaNet object detection.
Apache License 2.0
4.38k stars 1.96k forks source link

Optimized version of compute_overlap #1555

Closed znorman-harris closed 2 years ago

znorman-harris commented 3 years ago

I converted the compute_overlap function to Python and had major performance issues with it. For others that might be interested, I have a vectorized version of it using array-based operations which is 15-20x faster for input images of size [640, 640, nChannels].

With this updated function, my training went from 2400 seconds per epoch to 120 seconds per epoch!!! I thought there was something wrong with my computer due to how long the training was taking and was quite pleased when I found this routine as the bottleneck.

To make sure that this function operates the same way as the original did, for my dataset I did a numpy.array_equal() for the original `compute_overlap() function and the vectorized function to verify that the outputs are the same,

# about 20x faster than compute_overlap as python function
def compute_overlap_vectorized(boxes, query_boxes):
    """
    Args
        a: (N, 4) ndarray of float
        b: (K, 4) ndarray of float

    Returns
        overlaps: (N, K) ndarray of overlap between boxes and query_boxes
    """

    # get number of boxes
    N = boxes.shape[0]
    K = query_boxes.shape[0]

    # pre-calculate areas of both boxes
    areaN = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    areaK = (query_boxes[:, 2] - query_boxes[:, 0]) * (query_boxes[:, 3] - query_boxes[:, 1])

    # allocate array of overlaps
    overlaps = np.zeros((N, K), dtype=np.float64)

    # process our comparison boxes, assumption that query_boxes is smaller than boxes
    for k in range(K):
        # check for x overlap
        iw = np.minimum(boxes[:, 2], query_boxes[k, 2]) - np.maximum(boxes[:, 0], query_boxes[k, 0])
        idxW = np.where(iw > 0)[0]

        # only process boxes we overlap wioth
        if len(idxW) > 0:
            # check for y overlap
            ih = np.minimum(boxes[idxW, 3], query_boxes[k, 3]) - np.maximum(boxes[idxW, 1], query_boxes[k, 1])
            idxH = np.where(ih > 0)[0]

            # only process if we have y overlap
            if len(idxH) > 0:
                # print(K, len(idxW), len(idxH))
                idxW = idxW[idxH]
                # subset x and y overlap info
                iw = iw[idxW]
                ih = ih[idxH]

                # get union area
                ua = np.float64(
                    areaN[idxW] +
                    areaK[k] - iw * ih
                )

                # calculate overlaps and update
                overlaps[idxW, k] = iw * ih / ua

    return overlaps
stale[bot] commented 2 years ago

This issue has been automatically marked as stale due to the lack of recent activity. It will be closed if no further activity occurs. Thank you for your contributions.