ltkong218 / IFRNet

IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation (CVPR 2022)

The generated image size has changed (on 1080x1920) #15

Open · wlfAI opened this issue 2 years ago

wlfAI commented 2 years ago

(screenshot attached showing the mismatched output size)

ltkong218 commented 1 year ago

The input spatial size of IFRNet should be divisible by 16, and 1080 is not, which is why the output size changes. You can first pad the input frames and then unpad the output frame:

import torch.nn.functional as F

class InputPadder:
    """ Pads images such that their spatial dimensions are divisible by divisor """
    def __init__(self, dims, divisor):
        self.ht, self.wd = dims[-2:]
        # Amount of padding needed to reach the next multiple of divisor (0 if already divisible)
        pad_ht = (((self.ht // divisor) + 1) * divisor - self.ht) % divisor
        pad_wd = (((self.wd // divisor) + 1) * divisor - self.wd) % divisor
        # [left, right, top, bottom], split as evenly as possible between the two sides
        self._pad = [pad_wd // 2, pad_wd - pad_wd // 2, pad_ht // 2, pad_ht - pad_ht // 2]

    def pad(self, *inputs):
        return [F.pad(x, self._pad, mode='replicate') for x in inputs]

    def unpad(self, x):
        # Crop the padded borders back off to recover the original resolution
        ht, wd = x.shape[-2:]
        c = [self._pad[2], ht - self._pad[3], self._pad[0], wd - self._pad[1]]
        return x[..., c[0]:c[1], c[2]:c[3]]
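
As a quick sanity check for the 1080x1920 case reported here, this is what the formula above works out to (a minimal sketch; the shape tuple is just an illustrative example, not output copied from the repository):

# Height 1080, divisor 16: ((1080 // 16) + 1) * 16 - 1080 = 1088 - 1080 = 8, and 8 % 16 = 8,
# so 8 rows of padding are added (4 on top, 4 on the bottom), giving 1088x1920 inputs.
# Width 1920 is already a multiple of 16, so the modulo makes pad_wd = 0.
padder = InputPadder((1, 3, 1080, 1920), 16)
print(padder._pad)   # [0, 0, 4, 4]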

# read() loads an image as an HxWxC uint8 array (e.g. the repository's utility).
# I1 is the ground-truth middle frame and is only needed if you want to evaluate the prediction.
I0 = read(I0_path)
I1 = read(I1_path)
I2 = read(I2_path)
# Convert HWC uint8 images to 1x3xHxW float tensors in [0, 1]
I0 = (torch.tensor(I0.transpose(2, 0, 1)).float() / 255.0).unsqueeze(0).to(device)
I1 = (torch.tensor(I1.transpose(2, 0, 1)).float() / 255.0).unsqueeze(0).to(device)
I2 = (torch.tensor(I2.transpose(2, 0, 1)).float() / 255.0).unsqueeze(0).to(device)
# Pad only the two input frames; the prediction is unpadded afterwards
padder = InputPadder(I0.shape, 16)
I0, I2 = padder.pad(I0, I2)
# embt = 0.5 requests the frame exactly halfway between I0 and I2
embt = torch.tensor(1/2).float().view(1, 1, 1, 1).to(device)

I1_pred = model.inference(I0, I2, embt)
I1_pred = padder.unpad(I1_pred)

You can refer to benchmarks/SNU_FILM.py.
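
If you just want to confirm that the unpadded prediction is back at the original resolution and save it to disk, a minimal follow-up sketch continuing from the snippet above (the assert shape, the output file name, and the use of imageio are my own choices, not from the repository):

import imageio

# The unpadded prediction should have the original 1080x1920 resolution again
assert I1_pred.shape[-2:] == (1080, 1920)

# Convert the 1x3xHxW float tensor in [0, 1] back to an HxWx3 uint8 image and write it out
out = (I1_pred[0].clamp(0, 1) * 255.0).byte().cpu().numpy().transpose(1, 2, 0)
imageio.imwrite('I1_pred.png', out)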