styler00dollar opened 2 years ago
For your needs, I think you can add the following ensemble inference function to IFRNet.py, IFRNet_L.py, and IFRNet_S.py:
```python
def inference_ensemble(self, img0, img1, embt, scale_factor=1.0):
    imgt_pred_1 = self.inference(img0, img1, embt, scale_factor)
    imgt_pred_2 = self.inference(img1, img0, 1 - embt, scale_factor)
    imgt_pred = (imgt_pred_1 + imgt_pred_2) / 2.0
    return imgt_pred
```
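A quick way to sanity-check the symmetry of this scheme is with a toy stand-in for `self.inference` (the linear blend with an artificial bias below is purely illustrative, not IFRNet's actual model):

```python
import numpy as np

def toy_inference(img0, img1, embt):
    # Hypothetical stand-in for model.inference: a plain linear blend
    # plus a small bias toward img0, mimicking a systematic model error.
    return (1 - embt) * img0 + embt * img1 + 0.1 * (img0 - img1)

def inference_ensemble(img0, img1, embt):
    # Average the forward prediction with the time-reversed one,
    # exactly as in the snippet above.
    pred_1 = toy_inference(img0, img1, embt)
    pred_2 = toy_inference(img1, img0, 1 - embt)
    return (pred_1 + pred_2) / 2.0

img0 = np.zeros((1, 3, 4, 4))
img1 = np.ones((1, 3, 4, 4))
out = inference_ensemble(img0, img1, 0.5)
# The opposite biases of the two passes cancel, recovering the midpoint.
print(np.allclose(out, 0.5))  # True
```

Because the reversed pass sees the images in swapped order with time `1 - embt`, any directional bias of the model tends to cancel in the average.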
I was thinking of ensembling the flow rather than the final result. RIFE does this after every block, merging the two flow estimates, but your approach is also interesting and one I could test.
For reference, here is what I had in mind:
```python
for i in range(4):
    if flow is None:
        flow, mask = block[i](
            torch.cat((img0[:, :3], img1[:, :3], timestep), 1),
            None,
            scale=scale_list[i],
        )
        if ensemble:
            f1, m1 = block[i](
                torch.cat((img1[:, :3], img0[:, :3], 1 - timestep), 1),
                None,
                scale=scale_list[i],
            )
            # Swap the flow channels of the reversed pass before averaging
            flow = (flow + torch.cat((f1[:, 2:4], f1[:, :2]), 1)) / 2
```
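The `torch.cat((f1[:, 2:4], f1[:, :2]), 1)` step relies on the flow layout implied by the snippet: channels 0-1 hold the flow toward the first input image and channels 2-3 the flow toward the second. Since the reversed pass sees the images in swapped order, its channel groups come out swapped and must be realigned before averaging. A small NumPy sketch of just that alignment (shapes and layout are assumptions based on the snippet above):

```python
import numpy as np

N, H, W = 1, 2, 2
flow = np.random.randn(N, 4, H, W)  # forward pass: (img0, img1, t)

# For an ideal model, the reversed pass (img1, img0, 1 - t) would return
# the same two flow fields with the channel groups swapped:
f1 = np.concatenate((flow[:, 2:4], flow[:, :2]), axis=1)

# The alignment step undoes the swap, so in this idealized case
# averaging leaves the flow unchanged:
f1_aligned = np.concatenate((f1[:, 2:4], f1[:, :2]), axis=1)
ensembled = (flow + f1_aligned) / 2
print(np.allclose(ensembled, flow))  # True
```

In practice the two passes disagree slightly, and the average of the aligned flows is what smooths out that disagreement.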
I would need to figure out how to implement something like that for IFRNet. Still, thanks.
I think your suggestion is better than what I did above: directly ensembling the final results causes blurry textures, whereas ensembling the intermediate optical flow does not have this problem. I will add a new ensemble_inference
function based on your suggestion later.
Thank you. I will wait for it. :)
Ensembling is used in frame interpolation to drastically improve visual quality: several predictions are generated and averaged. Here is a paper discussing ensembling.
RIFE uses it as well, with two predictions, as can be seen here. It should not be very hard to add, and I would trade some speed for more quality. Thanks.