vt-vl-lab / FGVC

[ECCV 2020] Flow-edge Guided Video Completion

The implementation of the homography warp before optical flow calculation #55

Closed veizgyauzgyauz closed 2 years ago

veizgyauzgyauz commented 3 years ago

Thanks for sharing such great work! As stated in the paper, a homography warp, estimated using RANSAC on ORB feature matches, is applied before calculating the optical flow. I'm wondering where the implementation of this operation is located, since I can't find it in the code.

brunomsantiago commented 3 years ago

I am not part of the project, but I've read the code a few times. The best function for following what happens under the hood is video_completion_seamless() in video_completion.py.

In this function, the first thing they do with the frames is calculate the flow, so they probably meant "before using the optical flow" rather than "before calculating the optical flow". There is a part of this same function where they calculate some gradients (see lines 468 to 492 of video_completion.py). I don't know much about RANSAC or ORB, but it seems to be feature matching.
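
For context, ORB is a keypoint detector/descriptor and RANSAC is a robust estimator that fits a model (here, a homography) while discarding outlier matches. The generic version of that pipeline, sketched from standard OpenCV calls (not this repo's code; the frame paths are placeholders), looks roughly like this:

import cv2
import numpy as np

# load two frames (placeholder paths) and convert to grayscale
grayA = cv2.cvtColor(cv2.imread("frame_a.png"), cv2.COLOR_BGR2GRAY)
grayB = cv2.cvtColor(cv2.imread("frame_b.png"), cv2.COLOR_BGR2GRAY)

# ORB: detect keypoints and compute binary descriptors
orb = cv2.ORB_create()
kpsA, featA = orb.detectAndCompute(grayA, None)
kpsB, featB = orb.detectAndCompute(grayB, None)

# brute-force matching with Hamming distance (suited to ORB's binary descriptors)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(featA, featB), key=lambda m: m.distance)

# RANSAC: fit a homography to the matched points, rejecting outliers
ptsA = np.float32([kpsA[m.queryIdx].pt for m in matches])
ptsB = np.float32([kpsB[m.trainIdx].pt for m in matches])
H, inlier_mask = cv2.findHomography(ptsA, ptsB, cv2.RANSAC, 4.0)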

veizgyauzgyauz commented 3 years ago

Thanks for your reply! I've checked the lines you pointed out, but it seems that the author just applies cv2.inpaint for video frame inpainting and then calculates gradients of the inpainted frames. The gradients are used for fusing temporal neighbors in Sec. 3.4. I guess the author inpaints the video frames first because it leads to better gradients. But those operations have nothing to do with estimating an aligning homography matrix using RANSAC.
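
For concreteness, the kind of operation being described there looks roughly like this (a sketch, not the repo's exact code; the paths and the mask convention are assumptions):

import cv2
import numpy as np

# frame: 8-bit BGR image; mask: 8-bit single-channel, non-zero where missing
frame = cv2.imread("frame.png")
mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)

# fill the masked region so gradients near the hole are better behaved
inpainted = cv2.inpaint(frame, mask, 3, cv2.INPAINT_TELEA)

# horizontal and vertical gradients of the inpainted frame
gray = cv2.cvtColor(inpainted, cv2.COLOR_BGR2GRAY).astype(np.float32)
gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)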

gaochen315 commented 3 years ago

Hi @veizgyauzgyauz, I didn't include this part in the release because current OpenCV builds no longer ship these patented feature extractors (SURF sits in the non-free xfeatures2d module) due to licensing issues. Here is the code if you are interested.

import cv2
import imutils
import numpy as np
import torch

def detectAndDescribe(image):
    # convert the image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # check to see if we are using OpenCV 3.X
    if imutils.is_cv3(or_better=True):

        # detect and extract SURF features (OpenCV converts the color
        # image to grayscale internally here)
        descriptor = cv2.xfeatures2d.SURF_create()
        (kps, features) = descriptor.detectAndCompute(image, None)

        # orb feature is way faster
        # orb = cv2.ORB_create()
        # kp = orb.detect(gray, None)
        # (kps, features) = orb.compute(gray, kp)

    # otherwise, we are using OpenCV 2.4.X
    else:
        # detect keypoints in the image
        detector = cv2.FeatureDetector_create("SIFT")
        kps = detector.detect(gray)

        # extract features from the image
        extractor = cv2.DescriptorExtractor_create("SIFT")
        (kps, features) = extractor.compute(gray, kps)

    # convert the keypoints from KeyPoint objects to NumPy
    # arrays
    kps = np.float32([kp.pt for kp in kps])

    # return a tuple of keypoints and features
    return (kps, features)

def matchKeypoints(kpsA, kpsB, featuresA, featuresB, ratio=0.75, reprojThresh=4.0):
    # compute the raw matches and initialize the list of actual
    # matches ("BruteForce" uses L2 distance, which suits SURF/SIFT;
    # ORB's binary descriptors would need "BruteForce-Hamming")
    matcher = cv2.DescriptorMatcher_create("BruteForce")
    rawMatches = matcher.knnMatch(featuresA, featuresB, 2)
    matches = []

    # loop over the raw matches
    for m in rawMatches:
        # ensure the distance is within a certain ratio of each
        # other (i.e. Lowe's ratio test)
        if len(m) == 2 and m[0].distance < m[1].distance * ratio:
            matches.append((m[0].trainIdx, m[0].queryIdx))

    # computing a homography requires at least 4 correspondences;
    # requiring more than 4 gives RANSAC some redundancy
    if len(matches) > 4:
        # construct the two sets of points
        ptsA = np.float32([kpsA[i] for (_, i) in matches])
        ptsB = np.float32([kpsB[i] for (i, _) in matches])

        # compute the homography between the two sets of points
        (H, status) = cv2.findHomography(ptsA, ptsB, cv2.RANSAC, reprojThresh)

        # return the matches along with the homography matrix
        # and the status of each matched point
        return (matches, H, status)

    # otherwise, no homography could be computed
    return None

def getimage(img1_path, img2_path, size=None):
    frame1 = cv2.imread(img1_path)
    frame2 = cv2.imread(img2_path)

    if size is not None:
        frame1 = cv2.resize(frame1, (size[1], size[0]))
        frame2 = cv2.resize(frame2, (size[1], size[0]))

    imgH, imgW, _ = frame1.shape

    # estimate the homography H_BA that registers frame2 onto frame1,
    # falling back to the identity if matching or estimation fails
    (kpsA, featuresA) = detectAndDescribe(frame1)
    (kpsB, featuresB) = detectAndDescribe(frame2)
    try:
        (_, H_BA, _) = matchKeypoints(kpsB, kpsA, featuresB, featuresA)
    except Exception:
        H_BA = np.eye(3)

    if H_BA is None:
        H_BA = np.eye(3)

    # the homography must be invertible, since infer() maps
    # coordinates back through its inverse
    try:
        np.linalg.inv(H_BA)
    except np.linalg.LinAlgError:
        H_BA = np.eye(3)

    # warp frame2 into frame1's coordinate frame
    img2_registered = cv2.warpPerspective(frame2, H_BA, (imgW, imgH))
    frame1_tensor = torch.from_numpy(frame1).permute(2, 0, 1).contiguous().float()
    frame2_tensor = torch.from_numpy(frame2).permute(2, 0, 1).contiguous().float()
    frame2_reg_tensor = torch.from_numpy(img2_registered).permute(2, 0, 1).contiguous().float()

    return frame1_tensor, frame2_tensor, frame2_reg_tensor, H_BA

def infer(args, Flownet, device, img1_name, img2_name, size=None):

    img1, img2, img2_reg, H_BA = getimage(img1_name, img2_name, size)
    _, imgH, imgW = img1.shape

    # add a batch dimension and move the tensors to the target device
    img1 = img1[None].to(device)
    img2 = img2[None].to(device)
    img2_reg = img2_reg[None].to(device)

    if not args.homography:
        flow = Flownet(img1, img2)[0].permute(1, 2, 0).data.cpu().numpy()
    else:
        # flow from frame1 to the *registered* (warped) frame2
        flow = Flownet(img1, img2_reg)[0].permute(1, 2, 0).data.cpu().numpy()

        (fy, fx) = np.mgrid[0:imgH, 0:imgW].astype(np.float32)

        # absolute target coordinates in the registered frame
        fxx = fx + flow[:, :, 0]
        fyy = fy + flow[:, :, 1]

        # map those coordinates back through H_BA^-1 so the final flow
        # points into the original (unwarped) frame2
        (fxxx, fyyy, fz) = np.linalg.inv(H_BA).dot(np.concatenate((fxx.reshape(1, -1),
                                                   fyy.reshape(1, -1),
                                                   np.ones_like(fyy).reshape(1, -1)), axis=0))
        fxxx, fyyy = fxxx / fz, fyyy / fz

        # convert back from absolute coordinates to flow vectors
        flow = np.concatenate((fxxx.reshape(imgH, imgW, 1) - fx.reshape(imgH, imgW, 1),
                               fyyy.reshape(imgH, imgW, 1) - fy.reshape(imgH, imgW, 1)), axis=2)

    return flow
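
For anyone trying to run the snippet above, a minimal usage sketch. The flow network here is a hypothetical stand-in (FGVC uses RAFT); any callable mapping two (1, 3, H, W) image tensors to a (1, 2, H, W) flow tensor should work, and the frame paths are placeholders:

from types import SimpleNamespace

import torch

# `flownet` is a hypothetical stand-in for the flow model (FGVC uses RAFT)
args = SimpleNamespace(homography=True)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
flow = infer(args, flownet, device, "frames/00000.png", "frames/00001.png")
print(flow.shape)  # (H, W, 2): flow from frame 1 into the original frame 2
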
kpbhat25 commented 8 months ago

If I know the optical flow between two images, can I estimate the homography warp?
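
In principle, yes: dense flow gives a correspondence for every pixel, so one can subsample those correspondences and fit a homography with RANSAC, just as with sparse feature matches. A rough sketch of that idea (not from this repo; `flow` is assumed to be an (H, W, 2) array like the one infer() returns):

import cv2
import numpy as np

def homography_from_flow(flow, step=16, reproj_thresh=4.0):
    # treat every step-th pixel as a correspondence (x, y) -> (x+u, y+v)
    H_img, W_img = flow.shape[:2]
    ys, xs = np.mgrid[0:H_img:step, 0:W_img:step]
    src = np.stack((xs.ravel(), ys.ravel()), axis=1).astype(np.float32)
    dst = src + flow[ys.ravel(), xs.ravel()].astype(np.float32)

    # RANSAC rejects pixels whose motion does not fit a single plane
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, reproj_thresh)
    return H, inliers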