YoYo000 / BlendedMVS

BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks

Blending details #10

Closed decai-chen closed 4 years ago

decai-chen commented 4 years ago

Hi Yao,

thanks for the amazing work!

I find your idea of blending in the frequency domain quite cool. I am currently struggling with the large domain gap between real and synthetic images when training a stereo matching network. I used Blender to render images and depth maps, but the model trained on the rendered images performed badly when tested on real images of the same object/scene. So I think your blending idea could be quite useful for me.

  1. In your paper you do not cite any other work for this blending algorithm. Is this a novel idea of yours, or can I find more details in other papers?

  2. It would be very nice if you could also provide the code/script for your blending implementation.

Thank you!

YoYo000 commented 4 years ago

Thanks for your interest in our work @decai-chen.

I did not find the same blending procedure in other works, but I think it is a fairly straightforward idea - multi-band blending could perhaps be viewed as related work.
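In short (this is just a summary of the code below), the blended image keeps the low-frequency band of the input photograph everywhere and takes the high-frequency band from the rendered image wherever the rendered depth is valid:

blended = lowpass(real) + (1 - mask) * highpass(real) + mask * highpass(rendered)

Here the low/high-pass split is a Gaussian filter applied in the Fourier domain, and the mask is the rendered (depth > 0) region, slightly eroded at its boundary.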

The blending procedure is implemented with the following piece of Python code:

import os
import cv2
import numpy as np
import matplotlib.pyplot as plt
from scipy import fftpack
from scipy.ndimage import binary_dilation

def gen_gaussian_kernel_frequency(image):
    # Build a Gaussian low-pass kernel directly in the (unshifted) frequency
    # domain, where the low frequencies sit at the four corners of the FFT.
    # The 750-sample falloff assumes the image is at least 750 x 750 pixels.
    gaussian_line = np.linspace(0, 750, 750)
    gaussian_line = np.exp(-0.0001 * gaussian_line**2)
    gaussian_kernel_0 = gaussian_line[:, np.newaxis] * gaussian_line[np.newaxis, :]
    gaussian_kernel_1 = np.flip(gaussian_kernel_0, axis=0)
    gaussian_kernel_2 = np.flip(gaussian_kernel_0, axis=1)
    gaussian_kernel_3 = np.flip(gaussian_kernel_2, axis=0)

    # place one quarter of the Gaussian at each corner of the spectrum
    h, w, c = image.shape
    fkernel = np.zeros((h, w), 'float32')
    fkernel[0:750, 0:750] = gaussian_kernel_0
    fkernel[h-750:h, 0:750] = gaussian_kernel_1
    fkernel[0:750, w-750:w] = gaussian_kernel_2
    fkernel[h-750:h, w-750:w] = gaussian_kernel_3

    return fkernel[:, :, np.newaxis]

def render_origin_blending(original_path, rendered_path, blended_path, rendered_depth_path):
    # skip views whose blended outputs already exist
    if os.path.exists(blended_path) and os.path.exists(blended_path.replace('.jpg', '_masked.jpg')):
        return

    print('blending image', rendered_path)

    # read images: image_1 is the original photo, image_2 the rendered image;
    # the photo is resized to the rendered resolution so both images match
    image_1 = plt.imread(original_path)
    if not os.path.exists(rendered_path):
        return
    image_2 = plt.imread(rendered_path)
    image_1 = cv2.resize(image_1, image_2.shape[0:2][::-1])
    depth = plt.imread(rendered_depth_path)

    # high and low pass filter (gaussian)
    low_pass_kernel = gen_gaussian_kernel_frequency(image_1)
    high_pass_kernel = 1 - low_pass_kernel

    # dft
    fimage_1 = fftpack.fft2(image_1, axes=(0, 1))
    fimage_2 = fftpack.fft2(image_2, axes=(0, 1))

    # mask of pixels with a valid rendered depth
    rendered_mask = np.uint8(depth > 0)

    # inner boundary of the rendered region
    k = np.zeros((3, 3), dtype=int); k[1] = 1; k[:, 1] = 1  # cross-shaped (4-connected) structure
    boundary_mask = binary_dilation(rendered_mask == 0, k) & rendered_mask

    # thicken the boundary and subtract it, i.e. shrink the rendered mask away
    # from the silhouette edges before swapping in the rendered high frequencies
    k = np.ones((13, 13), np.uint8)
    boundary_mask = cv2.dilate(boundary_mask, k)
    comb_rendered_mask = 1 - np.clip((1 - rendered_mask) + boundary_mask, 0, 1)
    comb_rendered_mask = comb_rendered_mask[..., np.newaxis]

    # apply the low/high-pass filters in the frequency domain
    filtered_fimage_1_l = low_pass_kernel * fimage_1
    filtered_fimage_1_h = high_pass_kernel * fimage_1
    filtered_fimage_2_h = high_pass_kernel * fimage_2

    # inverse dft
    filtered_image_1_l = fftpack.ifft2(filtered_fimage_1_l, axes=(0, 1))
    filtered_image_1_h = fftpack.ifft2(filtered_fimage_1_h, axes=(0, 1))
    filtered_image_2_h = fftpack.ifft2(filtered_fimage_2_h, axes=(0, 1))

    # blend: low frequencies from the original photo everywhere,
    # high frequencies from the rendered image inside the eroded mask
    blended_image = filtered_image_1_l + (1 - comb_rendered_mask) * filtered_image_1_h + comb_rendered_mask * filtered_image_2_h
    blended_image = np.uint8(np.clip(np.absolute(blended_image), 0, 255))

    # masked variant: blended result restricted to the rendered (depth > 0) region
    blended_image_masked = (filtered_image_1_l + filtered_image_2_h) * rendered_mask[..., np.newaxis]
    blended_image_masked = np.uint8(np.clip(np.absolute(blended_image_masked), 0, 255))

    plt.imsave(blended_path, blended_image)
    plt.imsave(blended_path.replace('.jpg', '_masked.jpg'), blended_image_masked)
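
For reference, calling the function over one view could look like the snippet below. The paths are placeholders that I made up to illustrate the four arguments (original photo, rendered image, output blended image, rendered depth map); they are not the actual dataset layout.

# hypothetical example call; the paths below are placeholders, not the real file layout
render_origin_blending(
    original_path='scene_x/images_original/00000000.jpg',
    rendered_path='scene_x/images_rendered/00000000.jpg',
    blended_path='scene_x/blended_images/00000000.jpg',
    rendered_depth_path='scene_x/rendered_depth_maps/00000000.png')
# writes scene_x/blended_images/00000000.jpg and 00000000_masked.jpg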
decai-chen commented 4 years ago

Thank you very much for sharing the blending code! I think your blending idea is promising for bridging the domain gap between real and synthetic datasets :D