hzwer / Practical-RIFE

We are developing a more practical frame interpolation approach.
MIT License

RIFE 4.0 causes tensor size mismatch errors for some resolutions #6

Closed: n00mkrad closed this issue 2 years ago

n00mkrad commented 2 years ago

Interpolating 720p or 1440p video with RIFE 4.0 throws an error after some frames:

  File "D:\Code\GitHub\flowframes\Code\bin\x64\Release\FlowframesData\pkgs\rife-cuda\arch\RIFE_HDv3.py", line 59, in inference
    flow, mask, merged = self.flownet(imgs, timestep, scale_list)
  File "D:\Software\Python38\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Code\GitHub\flowframes\Code\bin\x64\Release\FlowframesData\pkgs\rife-cuda\arch\IFNet_HDv3.py", line 99, in forward
    f0, m0 = block[i](torch.cat((warped_img0[:, :3], warped_img1[:, :3], timestep, mask), 1), flow, scale=scale_list[i])
RuntimeError: Sizes of tensors must match except in dimension 2. Got 768 and 736 (The offending index is 2)

1080p works fine, so this looks like some kind of scaling/cropping issue?

I can reliably reproduce this problem on 4.0, but not on 3.9, so it seems to be caused by IFNet_HDv3.py changes.

hzwer commented 2 years ago

https://github.com/hzwer/Practical-RIFE/blob/66a4e2917e7e23a52ba467ee77ba1617962718ea/inference_video.py#L196 Increasing this padding number to 128 can fix this issue.
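To illustrate why the padding value matters, here is a minimal sketch of rounding a frame size up to a multiple before inference. It is not copied from inference_video.py; the helper names `round_up` and `padding_for` are hypothetical, and the only assumption is that frames are padded on the right and bottom up to the next multiple. With a multiple of 32, a 720-pixel height becomes 736; with 128 it becomes 768. Those are the two sizes reported in the error above.

```python
def round_up(value: int, multiple: int) -> int:
    """Round value up to the nearest multiple (value itself if already aligned)."""
    return ((value - 1) // multiple + 1) * multiple


def padding_for(height: int, width: int, multiple: int) -> tuple:
    """Return (left, right, top, bottom) padding that aligns both dimensions.

    The tuple order matches what torch.nn.functional.pad expects for the
    last two dimensions of an image tensor; torch itself is not needed here.
    """
    padded_h = round_up(height, multiple)
    padded_w = round_up(width, multiple)
    return (0, padded_w - width, 0, padded_h - height)


if __name__ == "__main__":
    for h, w in [(720, 1280), (1080, 1920), (1440, 2560)]:
        for multiple in (32, 128):
            print(h, w, multiple, round_up(h, multiple), padding_for(h, w, multiple))
```

Padding to the larger multiple costs a few extra rows of computation per frame, and the padded rows can be cropped off the output afterwards.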

SportsmanLee commented 2 years ago

Hi, is this RIFE 4.0 model trained with perceptual loss? Thx.

hzwer commented 2 years ago

@SportsmanLee Yes, 0.5 l1 loss + 0.5 perceptual loss -> https://github.com/hzwer/Practical-RIFE/blob/66a4e2917e7e23a52ba467ee77ba1617962718ea/model/loss.py#L98
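As a rough illustration of the 0.5 / 0.5 weighting, the sketch below combines a pixel-level L1 term with a feature-level term. It is not the code in model/loss.py: `combined_loss` and `toy_features` are hypothetical names, the feature extractor is a toy stand-in for a pretrained network, and comparing features with L1 distance is an assumption.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def l1_loss(a: Vector, b: Vector) -> float:
    """Mean absolute difference between two equally sized vectors."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def combined_loss(
    pred: Vector,
    target: Vector,
    features: Callable[[Vector], Vector],
    w_l1: float = 0.5,
    w_perceptual: float = 0.5,
) -> float:
    """Weighted sum of a pixel L1 term and a feature-space (perceptual) term."""
    pixel_term = l1_loss(pred, target)
    # Assumption: the perceptual term is an L1 distance between features.
    perceptual_term = l1_loss(features(pred), features(target))
    return w_l1 * pixel_term + w_perceptual * perceptual_term


def toy_features(v: Vector) -> list:
    """Hypothetical stand-in for a pretrained feature extractor:
    differences between neighbouring values (a crude edge feature)."""
    return [b - a for a, b in zip(v, v[1:])]


if __name__ == "__main__":
    print(combined_loss([0.0, 1.0, 2.0], [0.0, 2.0, 2.0], toy_features))
```

In a real training setup the stand-in would be replaced by activations from a pretrained network, and both terms would be computed on image tensors rather than flat lists.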