PyImageSearch / imutils

A series of convenience functions to make basic image processing operations such as translation, rotation, resizing, skeletonization, and displaying Matplotlib images easier with OpenCV and Python.
MIT License
4.51k stars 1.03k forks source link

perspective.four_point_transform() doesn't always work #221

Open haseeb33 opened 3 years ago

haseeb33 commented 3 years ago

I was using this method to extract images after applying Mask-RCNN and found that it doesn't work for all the cases. Just a use case [(1604,934), (1693,797), (1693,923), (1627,1095)].

perspective.four_point_transform() calls order_points(pts) and it's implementation is mentioned below with above mentioned use case.

from scipy.spatial import distance as dust
pts = np.array([(1604,934), (1627,1095), (1693,797), (1693,923)])
xSorted = pts[np.argsort(pts[:, 0]), :]

leftMost = xSorted[:2, :]
rightMost = xSorted[2:, :]

leftMost = leftMost[np.argsort(leftMost[:, 1]), :]
(tl, bl) = leftMost

 D = dist.cdist(tl[np.newaxis], rightMost, "euclidean")[0]
(br, tr) = rightMost[np.argsort(D)[::-1], :]

np.array([tl, tr, br, bl], dtype="float32")

Output of this code(order_points(pts)) is:

array([[1604.,  934.], [1693.,  923.], [1693.,  797.], [1627., 1095.]], dtype=float32)

Shouldn't it be [tl, tr, br, bl] == [1604., 934.], [1627., 1095.], [1693., 923.], [1693., 797.]

Please correct me if I am wrong.

farqis commented 1 year ago

@haseeb33, That is right. And the culprit is the line D = dist.cdist()[0] and the line below it. Euclidean distance is a really bad predictor for "guessing" the opposite corner. A more robust algorithm can be created to determine the points more accurately. However no algorithm can resolve the issue 100% due to more unknowns than equations / data points.

For everyone's benefit, following is what I use instead of Euclidean distance. It fixes the corner assignments that are clearly wrong, while others that simply result in the image being transposed can be fixed in post processing step for each use-case.

imutils seems to be poorly maintained. With years of pending PRs I don't see a point raising another one to fix the issue.

Here's my code. Hope this helps you and anyone else looking for the same.

def order_points(pts):
    xSorted = pts[np.argsort(pts[:, 0]), :]

    leftMost = xSorted[:2, :]
    rightMost = xSorted[2:, :]

    leftMost = leftMost[np.argsort(leftMost[:, 1]), :]
    (tl, bl) = leftMost

    # The change starts here...
    rightMost = rightMost[np.argsort(rightMost[:, 1]), :]
    (tr, br) = rightMost
    interPt = lineIntersection(np.reshape([tl,bl], 4), np.reshape([tr,br], 4))

    is_online_l, dl1, dl2 = isPointOnLineSegment(interPt, np.array(tl), np.array(bl))
    is_online_r, dr1, dr2 = isPointOnLineSegment(interPt, np.array(tr), np.array(br))

    if (is_online_l and is_online_r):
      (tr,br) = (br,tr) # Swap the corners to avoid clearly wrong config.

    # Can use "dlx" and possibly line slopes to make point=corner assignment more accurate
    # But not needed right now. The above method is good enough for most use-cases.

    return np.array([tl, tr, br, bl], dtype="float32")

def almostEqual(number1, number2):
    EPSILON = 1.0 * 10**(-5)
    if (abs(number1 - number2) <= (EPSILON * np.max([1.0, abs(number1), abs(number2)]))):
        return True
    return False

def lineIntersection(line1, line2):
    x = 0
    y = 0
    x1, y1, x2, y2 = line1
    x3, y3, x4, y4 = line2

    A1 = y2 - y1
    B1 = x1 - x2
    C1 = (x1 * A1) + (y1 * B1)

    A2 = y4 - y3
    B2 = x3 - x4
    C2 = (x3 * A2) + (y3 * B2)

    det = (A1 * B2) - (A2 * B1)

    if (not almostEqual(det, 0)):
        x = (C1 * B2 - C2 * B1) / (det)
        y = (C2 * A1 - C1 * A2) / (det)

    return [x, y]

def isPointOnLineSegment(point, lineSegmentStart, lineSegmentEnd):
    d1 = distanceBtwPoints(point, lineSegmentStart)
    d2 = distanceBtwPoints(point, lineSegmentEnd)
    lineSegmentLength = distanceBtwPoints(lineSegmentStart, lineSegmentEnd)

    return almostEqual(d1 + d2, lineSegmentLength), d1, d2

def distanceBtwPoints(p1, p2):
    return np.linalg.norm(p2-p1)