Justin62628 / Squirrel-RIFE

A frame-interpolation tool with better results and lower VRAM usage, 10-25x the speed of DAIN; includes duplicate-frame handling to remove anime judder
GNU General Public License v3.0

A frame-interpolation method that does not break anime character motion (GMFSS only) (integer-multiple interpolation only) #530

Closed routineLife1 closed 1 year ago

routineLife1 commented 1 year ago

Given three frames A, B, C, compute the optical flows floBA and floBC, take their per-pixel magnitudes, and divide one by the other; shrink the timestep on whichever side has the larger magnitude by that ratio, and bake the scaling into the timestep map. On footage animated on twos, on threes, or on even longer holds, anime characters move in large discrete steps, so this operation limits character motion and compensates only the background, preserving the drawn character poses as much as possible. (The operation adds little overhead.)
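
As a toy single-pixel illustration with made-up magnitudes (the real per-pixel version is gen_timestep_map below):

import torch

# hypothetical pixel: character motion B->A is 4x the motion B->C
dist_ba, dist_bc = torch.tensor([8.0]), torch.tensor([2.0])
ratio = dist_ba / dist_bc                                           # 4.0
l_t = torch.where(ratio >= 1, 1.0 / ratio, torch.ones_like(ratio))  # 0.25
r_t = torch.where(ratio < 1, ratio, torch.ones_like(ratio))         # 1.00
# a midpoint frame on the A-B side is rendered at t = 0.5 * 0.25 = 0.125,
# i.e. the fast-moving pixel barely leaves its old position, while background
# pixels with similar magnitudes on both sides (ratio ~ 1) keep t = 0.5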

Implementation:

import torch


def gen_timestep_map(flo_ba, flo_bc, ln, rn):
    # split the B->A and B->C flows into x / y components
    x_ba, y_ba = flo_ba[:, :1, :, :], flo_ba[:, 1:2, :, :]
    x_bc, y_bc = flo_bc[:, :1, :, :], flo_bc[:, 1:2, :, :]
    # per-pixel flow magnitudes
    distance_ba = torch.sqrt(torch.pow(x_ba, 2) + torch.pow(y_ba, 2))
    distance_bc = torch.sqrt(torch.pow(x_bc, 2) + torch.pow(y_bc, 2))
    # magnitude ratio; the eps keeps it finite (and ~1) where both flows vanish
    mod_bid = (distance_ba + 1e-6) / (distance_bc + 1e-6)
    # shrink the timestep on whichever side moves more, leave the other at 1
    l_timestep, r_timestep = torch.ones_like(mod_bid), torch.ones_like(mod_bid)
    l_timestep[mod_bid >= 1] /= mod_bid[mod_bid >= 1]
    r_timestep[mod_bid < 1] *= mod_bid[mod_bid < 1]

    # evenly spaced scalar timesteps: ln frames in A-B, rn frames in B-C
    l_t_array = [_i / (ln + 1) for _i in range(1, ln + 1)]
    r_t_array = [_i / (rn + 1) for _i in range(1, rn + 1)]

    t_ab = [l_timestep * t for t in l_t_array]
    t_bc = [r_timestep * t for t in r_t_array]

    # each timestep map has shape (1, 1, h, w)
    return t_ab, t_bc
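
A quick shape check, with random tensors standing in for real flows:

# sketch: random tensors stand in for the two optical flows
flo_ba = torch.randn(1, 2, 270, 480)
flo_bc = torch.randn(1, 2, 270, 480)
t_ab, t_bc = gen_timestep_map(flo_ba, flo_bc, ln=1, rn=1)
print(len(t_ab), len(t_bc), t_ab[0].shape)  # 1 1 torch.Size([1, 1, 270, 480])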

The modified GMFSS inference2 method, which takes a per-pixel timestep map (shape 1, 1, h, w) in place of the usual scalar t:

    def inference2(self, img0, img1, reuse_things, timestep):
        # flows, fusion metrics, and feature pyramids cached by model.reuse()
        flow01, metric0 = reuse_things[0], reuse_things[2]
        feat11, feat12, feat13 = reuse_things[4][0], reuse_things[4][1], reuse_things[4][2]
        flow10, metric1 = reuse_things[1], reuse_things[3]
        feat21, feat22, feat23 = reuse_things[5][0], reuse_things[5][1], reuse_things[5][2]

        # scale the flows and fusion metrics to time t, per pixel
        F1t = timestep * flow01
        F2t = (1 - timestep) * flow10

        Z1t = timestep * metric0
        Z2t = (1 - timestep) * metric1

        # warp both half-resolution images to time t
        img0 = F.interpolate(img0, scale_factor=0.5, mode="bilinear", align_corners=False)
        I1t = warp(img0, F1t, Z1t, strMode='soft')
        img1 = F.interpolate(img1, scale_factor=0.5, mode="bilinear", align_corners=False)
        I2t = warp(img1, F2t, Z2t, strMode='soft')

        # warp the finest feature level with the full-resolution flow
        feat1t1 = warp(feat11, F1t, Z1t, strMode='soft')
        feat2t1 = warp(feat21, F2t, Z2t, strMode='soft')

        # warp the mid feature level: downscale the flow and halve its magnitude
        F1td = F.interpolate(F1t, scale_factor=0.5, mode="bilinear", align_corners=False) * 0.5
        Z1d = F.interpolate(Z1t, scale_factor=0.5, mode="bilinear", align_corners=False)
        feat1t2 = warp(feat12, F1td, Z1d, strMode='soft')
        F2td = F.interpolate(F2t, scale_factor=0.5, mode="bilinear", align_corners=False) * 0.5
        Z2d = F.interpolate(Z2t, scale_factor=0.5, mode="bilinear", align_corners=False)
        feat2t2 = warp(feat22, F2td, Z2d, strMode='soft')

        # warp the coarsest feature level at quarter resolution
        F1tdd = F.interpolate(F1t, scale_factor=0.25, mode="bilinear", align_corners=False) * 0.25
        Z1dd = F.interpolate(Z1t, scale_factor=0.25, mode="bilinear", align_corners=False)
        feat1t3 = warp(feat13, F1tdd, Z1dd, strMode='soft')
        F2tdd = F.interpolate(F2t, scale_factor=0.25, mode="bilinear", align_corners=False) * 0.25
        Z2dd = F.interpolate(Z2t, scale_factor=0.25, mode="bilinear", align_corners=False)
        feat2t3 = warp(feat23, F2tdd, Z2dd, strMode='soft')

        # fuse warped images and multi-scale features into the intermediate frame
        out = self.fusionnet(torch.cat([img0, I1t, I2t, img1], dim=1), torch.cat([feat1t1, feat2t1], dim=1),
                             torch.cat([feat1t2, feat2t2], dim=1), torch.cat([feat1t3, feat2t3], dim=1))

        return torch.clamp(out, 0, 1)

Usage example:

def make_inference(I0, I1, I2, ln, rn, scale):
    # interpolate ln frames between I0 and I1, and rn frames between I1 and I2
    global model
    reusethings0 = model.reuse(I0, I1, scale)
    reusethings1 = model.reuse(I1, I2, scale)
    # flow10 is I1 -> I0 (B -> A); flow12 is I1 -> I2 (B -> C)
    flow10, flow12 = reusethings0[1], reusethings1[0]
    t01, t12 = gen_timestep_map(flow10, flow12, ln, rn)

    output = []

    # frames between I0 and I1
    for i in range(len(t01)):
        out = model.inference2(I0, I1, reusethings0, t01[i])
        out = out.squeeze(0).permute(1, 2, 0).cpu().numpy() * 255.
        output.append(out)

    # the original middle frame I1
    output.append(I1.squeeze(0).permute(1, 2, 0).cpu().numpy() * 255.)

    # frames between I1 and I2
    for i in range(len(t12)):
        out = model.inference2(I1, I2, reusethings1, t12[i])
        out = out.squeeze(0).permute(1, 2, 0).cpu().numpy() * 255.
        output.append(out)

    return output
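
A hypothetical end-to-end call for footage on twos (one interpolated frame per segment, so ln = rn = 1); the loader and file names below are placeholders, not part of the original post, and I0/I1/I2 are assumed to be normalized 1x3xHxW CUDA tensors as above:

import cv2
import torch

def load_frame(path):
    # placeholder loader: BGR uint8 image -> normalized 1x3xHxW float tensor
    img = cv2.imread(path)
    return torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0).float().cuda() / 255.

I0, I1, I2 = load_frame("a.png"), load_frame("b.png"), load_frame("c.png")
# the returned list holds the interpolated frames plus I1; I0 and I2 are not included
frames = make_inference(I0, I1, I2, ln=1, rn=1, scale=1.0)
for idx, frame in enumerate(frames):
    cv2.imwrite("out_%03d.png" % idx, frame.astype("uint8"))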

A training scheme that might improve results: from on-threes footage, prepare frames i0, i1, i2 (which hold the same character drawing) and i3, the next frame in which the character has moved relative to i2. Use i0, i1, i3 as the input and i2 as the ground truth; the data does not need to be particularly clean.
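
A sketch of how such triplets might be collected; the Dataset class and the frame indexing are assumptions layered on the description above (frames k, k+1, k+2 of an on-threes clip hold one drawing, and k+3 is the next drawing):

import torch
from torch.utils.data import Dataset

class OnThreesTriplets(Dataset):
    # clips: list of frame sequences (each frame a normalized CxHxW tensor)
    # cut from footage animated on threes
    def __init__(self, clips):
        self.samples = []
        for frames in clips:
            # i0, i1, i2 hold one drawing; i3 is the next drawing
            for k in range(0, len(frames) - 3, 3):
                i0, i1, i2, i3 = frames[k:k + 4]
                self.samples.append((i0, i1, i3, i2))  # inputs first, GT last

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        i0, i1, i3, gt = self.samples[idx]
        # the network sees (i0, i1, i3) and should reproduce i2 as the in-between
        return (i0, i1, i3), gt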

Justin62628 commented 1 year ago

SVFI 5.0.26