lisiyao21 / AnimeInterp

The code for CVPR21 paper "Deep Animation Video Interpolation in the Wild"
402 stars 39 forks

Run custom frames #11

Open Katzenwerfer opened 3 years ago

Katzenwerfer commented 3 years ago

I got the code running with the provided dataset, but I would prefer to test with custom frames. Is there any way to achieve this with the current code, or would it need to be implemented?

routineLife1 commented 3 years ago

1. Change testset_root in configs/config_test_w_sgm.py.
2. Create some folders under testset_root and put frame1.jpg, frame2.jpg, and frame3.jpg in each (frame2.jpg only needs to have the same resolution as the other two pictures; the image content does not need to be similar).
3. Rename the folders you created to the folder names contained in datasets/test_2k_pre_calc_sgm_flows.
4. Remove metric calculations such as PSNR from test_anime_sequence_one_by_one.py.
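A minimal sketch of steps 2 and 3, assuming your own frames are listed in my_triplets (an illustrative variable) and the pre-computed flow folders already exist:

import os
import shutil

testset_root = 'datasets/my_testset'               # must match configs/config_test_w_sgm.py
flow_root = 'datasets/test_2k_pre_calc_sgm_flows'  # pre-computed SGM flow folders

# Reuse the names of the pre-computed flow folders so the dataset
# loader can pair each frame triplet with an existing flow entry.
flow_folders = sorted(os.listdir(flow_root))

# Illustrative: (frame1, frame2, frame3) triplets of your own frames.
my_triplets = [('my_frames/0.jpg', 'my_frames/1.jpg', 'my_frames/2.jpg')]

for name, triplet in zip(flow_folders, my_triplets):
    dst = os.path.join(testset_root, name)
    os.makedirs(dst, exist_ok=True)
    for src, out_name in zip(triplet, ('frame1.jpg', 'frame2.jpg', 'frame3.jpg')):
        shutil.copy(src, os.path.join(dst, out_name))

The modified test script (with the PSNR/SSIM evaluation removed) follows: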

# Imports assume the AnimeInterp repo layout (models/, datas/, utils/).
import argparse
import os
import sys

import cv2
import torch
import torch.nn as nn
import torchvision.transforms as TF

import datas
import models
from utils.config import Config
from utils.vis_flow import flow_to_color


def save_flow_to_img(flow, des):
    # Swap the (u, v) channels and save a color visualization of the flow.
    f = flow[0].data.cpu().numpy().transpose([1, 2, 0])
    fcopy = f.copy()
    fcopy[:, :, 0] = f[:, :, 1]
    fcopy[:, :, 1] = f[:, :, 0]
    cf = flow_to_color(-fcopy)
    cv2.imwrite(des + '.jpg', cf)


def validate(config):
    # prepare datasets & normalization
    normalize1 = TF.Normalize(config.mean, [1.0, 1.0, 1.0])
    normalize2 = TF.Normalize([0, 0, 0], config.std)
    trans = TF.Compose([TF.ToTensor(), normalize1, normalize2])

    # inverse transforms to recover displayable images from normalized tensors
    revmean = [-x for x in config.mean]
    revstd = [1.0 / x for x in config.std]
    revnormalize1 = TF.Normalize([0.0, 0.0, 0.0], revstd)
    revnormalize2 = TF.Normalize(revmean, [1.0, 1.0, 1.0])
    revNormalize = TF.Compose([revnormalize1, revnormalize2])
    revtrans = TF.Compose([revnormalize1, revnormalize2, TF.ToPILImage()])

    testset = datas.AniTripletWithSGMFlowTest(config.testset_root, config.test_flow_root, trans, config.test_size, config.test_crop_size, train=False)
    sampler = torch.utils.data.SequentialSampler(testset)
    validationloader = torch.utils.data.DataLoader(testset, sampler=sampler, batch_size=1, shuffle=False, num_workers=1)
    to_img = TF.ToPILImage()

    print(testset)
    sys.stdout.flush()

    # prepare model
    model = getattr(models, config.model)(config.pwc_path).cuda()
    model = nn.DataParallel(model)

    # load weights
    dict1 = torch.load(config.checkpoint)
    model.load_state_dict(dict1['model_state_dict'], strict=False)

    store_path = config.store_path

    print('Everything prepared. Ready to test...')
    sys.stdout.flush()

    # start testing...
    with torch.no_grad():
        model.eval()
        for validationIndex, validationData in enumerate(validationloader, 0):
            print('Testing {}/{}-th group...'.format(validationIndex, len(testset)))
            sys.stdout.flush()
            sample, flow, index, folder = validationData

            frame1 = sample[0]
            frame2 = sample[-1]

            # initial SGM flow
            F12i, F21i = flow
            F12i = F12i.float().cuda()
            F21i = F21i.float().cuda()

            I1 = frame1.cuda()
            I2 = frame2.cuda()

            if not os.path.exists(config.store_path + '/' + folder[0][0]):
                os.mkdir(config.store_path + '/' + folder[0][0])

            # save the two input frames after undoing the normalization
            revtrans(I1.cpu()[0]).save(store_path + '/' + folder[0][0] + '/' + index[0][0] + '.jpg')
            revtrans(I2.cpu()[0]).save(store_path + '/' + folder[-1][0] + '/' + index[-1][0] + '.jpg')

            for tt in range(config.inter_frames):
                # sample the interpolation instant t uniformly in (0, 1)
                x = config.inter_frames
                t = 1.0 / (x + 1) * (tt + 1)

                outputs = model(I1, I2, F12i, F21i, t)
                It_warp = outputs[0]

                # note: with inter_frames > 1 every iteration writes to the
                # same file name; append tt to the name to keep all frames
                to_img(revNormalize(It_warp.cpu()[0]).clamp(0.0, 1.0)).save(store_path + '/' + folder[1][0] + '/' + index[1][0] + '.png')

                save_flow_to_img(outputs[1].cpu(), store_path + '/' + folder[1][0] + '/' + index[1][0] + '_F12')
                save_flow_to_img(outputs[2].cpu(), store_path + '/' + folder[1][0] + '/' + index[1][0] + '_F21')


if __name__ == "__main__":
    # load the configuration passed on the command line
    parser = argparse.ArgumentParser()
    parser.add_argument('config')
    args = parser.parse_args()
    config = Config.from_file(args.config)

    if not os.path.exists(config.store_path):
        os.mkdir(config.store_path)

    validate(config)
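For reference, the script takes the config file as its only command-line argument, e.g. python test_anime_sequence_one_by_one.py configs/config_test_w_sgm.py; the interpolated frames and flow visualizations are written under config.store_path.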
routineLife1 commented 3 years ago

If you want to add frames to a complete video, I guess the code needs to be changed a lot

lisiyao21 commented 3 years ago

[quoting routineLife1's setup steps and modified test script above]

Thanks! But please don't rename the "new frame" folders to match the pre-computed SGM flows. Otherwise, the wrong initial flow will definitely mislead the network...

The correct way is to generate new SGM flows for your own data using the code in models/sgm_model.
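Until an official guide exists, the following is only a hypothetical sketch of that workflow: compute_sgm_flow is a made-up name standing in for the actual entry point in models/sgm_model, and the .npy file names are assumptions that must be matched to what the dataset loader expects.

import os
import numpy as np

# Hypothetical wrapper around the color-piece segmentation and matching
# code in models/sgm_model; the real entry point and its arguments need
# to be taken from that directory.
from models.sgm_model import compute_sgm_flow  # hypothetical import

def gen_sgm_flows(frame1_path, frame3_path, out_dir):
    # Forward and backward initial flows for one triplet folder.
    flow13 = compute_sgm_flow(frame1_path, frame3_path)  # hypothetical call
    flow31 = compute_sgm_flow(frame3_path, frame1_path)  # hypothetical call
    os.makedirs(out_dir, exist_ok=True)
    # File names are assumptions; match them to the .npy files shipped
    # in datasets/test_2k_pre_calc_sgm_flows.
    np.save(os.path.join(out_dir, 'flow13.npy'), flow13)
    np.save(os.path.join(out_dir, 'flow31.npy'), flow31)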

We will try to write a guide as soon as we can...

lhao0301 commented 3 years ago

A script (with the specific hyper-parameters) to generate SGM flow for custom data would be greatly appreciated.

routineLife1 commented 3 years ago

The test code I used still reads the pre-computed SGM flow, but I replaced the flow files in the pre_calc_sgm_flow folder with optical flow generated by RAFT, and this did not affect the final exported result.
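For anyone who wants to try the same substitution, a sketch using torchvision's RAFT implementation (the original RAFT repo works similarly); the output file names here are placeholders and must match the shape and naming of the .npy files being replaced:

import numpy as np
import torch
from torchvision.io import read_image
from torchvision.models.optical_flow import Raft_Large_Weights, raft_large

device = 'cuda' if torch.cuda.is_available() else 'cpu'
weights = Raft_Large_Weights.DEFAULT
model = raft_large(weights=weights).eval().to(device)
transforms = weights.transforms()

def raft_flow(path_a, path_b):
    # Estimate flow from frame a to frame b; RAFT needs H and W divisible by 8.
    img_a = read_image(path_a).unsqueeze(0)
    img_b = read_image(path_b).unsqueeze(0)
    img_a, img_b = transforms(img_a, img_b)
    with torch.no_grad():
        flow = model(img_a.to(device), img_b.to(device))[-1]  # finest prediction
    return flow[0].cpu().numpy()  # shape (2, H, W)

# Placeholder names: overwrite the corresponding .npy files in the
# pre_calc_sgm_flow folder, keeping their original shape and naming.
np.save('flow13.npy', raft_flow('frame1.jpg', 'frame3.jpg'))
np.save('flow31.npy', raft_flow('frame3.jpg', 'frame1.jpg'))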

pepinu commented 3 years ago

I have a question that is semi-related to this issue, please tell me if I should start a new one.

If I want to run AnimeInterp on a custom cartoon, should I generate optical flow for it beforehand? Does it make sense to generate it ad hoc, or would that make inference much less efficient? Is there a way to run this without precomputed optical flow at all?

98mxr commented 3 years ago

[quoting pepinu's question above]

According to my tests, no matter what kind of optical flow is input, even empty or incorrect optical flow, the impact on the subsequent refined flow is not great, and the impact on the generated frames is even smaller.
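A minimal way to reproduce the "empty flow" observation, reusing the variable names from the test script earlier in this thread: replace the SGM initialization with all-zero flows before calling the model.

import torch

# Zero out the SGM initialization; I1/I2, model and t are as in the
# test script above, and the flows have layout (batch, 2, H, W).
N, _, H, W = I1.shape
F12i = torch.zeros(N, 2, H, W, device=I1.device)
F21i = torch.zeros_like(F12i)
outputs = model(I1, I2, F12i, F21i, t)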

98mxr commented 3 years ago

[quoting routineLife1's RAFT flow comment above]

@lisiyao21 I'm a little confused about this. Does it mean that RAFT (with end-to-end fine-tuning) is approximately equivalent to SGM+RFR (RAFT also contains recurrent refinement)? As I understand it, SGM should be effective for the problems particular to animation interpolation.

@lhao0301 I did some interesting tests in issue #13; please take a look and share your suggestions.

lhao0301 commented 3 years ago

[quoting routineLife1's RAFT flow comment and 98mxr's reply above]

It's interesting, and I have just seen it. I only skimmed the method section of the paper for a few minutes and may not understand it well, so first I need to read it carefully.

lisiyao21 commented 3 years ago

[quoting routineLife1's RAFT flow comment above]

Thanks for your comments! Would you please share more details of this experiment? Besides, one thing I'd like to ask: did you replace the .npy files in pre_calc_sgm_flows, or only the .jpg visualizations?

lhao0301 commented 3 years ago

[quoting routineLife1's RAFT flow comment and 98mxr's question above]

Assuming you are on the right track, my guess is as follows.

Firstly, note that the training contains 3 steps:

  1. Pre-train the RFR, as RAFT, from scratch; the initial flow is initialized to zeros.
  2. Fix the RFR and train RFR + interpolation net on realistic high-fps video data; the initial flow is again initialized to zeros.
  3. Add SGM and fine-tune RFR + interpolation net on the proposed animation dataset.

Why empty/zero flow also works: because of steps 1 and 2, the method should still work when you input empty/zero flow (just as in RAFT, where both zero-flow and warm-start initialization work). The RFR is presumably robust enough to correct and refine the initial flow.

Why the impact on the generated frame is small: I suggest testing some more extreme clips to check the improvement; common cases may not show the effect. As can be seen in the paper, the PSNR improvement grows from the easy cases to the hard ones, although it is not as pronounced as the gain from the RFR module. I guess the RFR module behaves about the same as other CNN flow methods and hence plays an important role in the interpolation task, while the SGM module only serves as a better warm initialization for some extreme cases.

Hope that is of some help to you. @98mxr

routineLife1 commented 3 years ago

[quoting the RAFT flow comment and lisiyao21's question above]

I replaced the .npy files.

98mxr commented 3 years ago

I have been communicating extensively with @YiWeiHuang-stack; we use the same test method.

lisiyao21 commented 3 years ago

[quoting 98mxr's question and lhao0301's analysis above]

Hi. Thanks for your interest.

As to the effect of the SGM module, please refer to the ablation study in our paper. Generally, the SGM module improves the interpolated results (by 0.14 dB on average), especially for cases with large motion. The SSIM values are indeed similar, which is the same as reported in the paper.

Also, it may not be suitable to judge the quality of flows used in video interpolation by the appearance of their visualizations (e.g. the boundaries), since the flow network is tuned to better fit the interpolation task. On this point, see the very good paper "Video Enhancement with Task-Oriented Flow" by Xue et al.

lisiyao21 commented 3 years ago

Another point, on whether the pre-computed SGM step can be skipped:

The implementation of SGM is based on color-piece segmentation and matching in "for" loops on the CPU. If the test frames contain too many color pieces, the SGM module becomes slow, so we split SGM off into a pre-calculated step. For time efficiency when generating a long piece of video, one could optionally modify the model into a w/o-SGM version, but the results of the whole model should be reported for formal comparisons (e.g. in a paper).