Katzenwerfer opened this issue 3 years ago
1. Change the testset_root in configs/config_test_w_sgm.py.
2. Create some folders under the testset_root and put frame1.jpg, frame2.jpg and frame3.jpg in each (frame2.jpg only needs to have the same resolution as the other two pictures; it does not need to be a similar image).
3. Rename the folders you created to the folder names contained in datasets/test_2k_pre_calc_sgm_flows.
4. Remove calculations such as PSNR from test_anime_sequence_one_by_one.py.
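For illustration, here is a minimal sketch of the folder layout those steps produce. Every path and the clip name below are placeholders (per step 3, the clip folder name has to match one of the folders under datasets/test_2k_pre_calc_sgm_flows):

```python
# Minimal layout sketch: one clip folder containing the two real frames plus a
# placeholder middle frame of the same resolution. All names are examples.
import os
import shutil
from PIL import Image

testset_root = 'datasets/my_custom_triplets'   # must match testset_root in configs/config_test_w_sgm.py
clip = 'some_clip_name'                        # per step 3, must match a folder under datasets/test_2k_pre_calc_sgm_flows

os.makedirs(os.path.join(testset_root, clip), exist_ok=True)
for name in ['frame1.jpg', 'frame2.jpg', 'frame3.jpg']:
    shutil.copy(name, os.path.join(testset_root, clip, name))

# step 2's only real constraint: all three frames share one resolution
sizes = {Image.open(os.path.join(testset_root, clip, n)).size
         for n in ['frame1.jpg', 'frame2.jpg', 'frame3.jpg']}
assert len(sizes) == 1, 'all three frames must have the same resolution'
```

The modified test_anime_sequence_one_by_one.py (with the PSNR calculations removed, per step 4) follows: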
```python
# Imports as in the original test_anime_sequence_one_by_one.py (the module paths
# for Config and flow_to_color may differ slightly in your checkout).
import os
import sys
import argparse

import cv2
import torch
import torch.nn as nn
import torchvision.transforms as TF

import models
import datas
from utils.config import Config
from utils.vis_flow import flow_to_color


def save_flow_to_img(flow, des):
    # (C, H, W) -> (H, W, C), swap the x/y channels and save a color-coded visualization
    f = flow[0].data.cpu().numpy().transpose([1, 2, 0])
    fcopy = f.copy()
    fcopy[:, :, 0] = f[:, :, 1]
    fcopy[:, :, 1] = f[:, :, 0]
    cf = flow_to_color(-fcopy)
    cv2.imwrite(des + '.jpg', cf)


def validate(config):
    # preparing datasets & normalization
    normalize1 = TF.Normalize(config.mean, [1.0, 1.0, 1.0])
    normalize2 = TF.Normalize([0, 0, 0], config.std)
    trans = TF.Compose([TF.ToTensor(), normalize1, normalize2, ])

    revmean = [-x for x in config.mean]
    revstd = [1.0 / x for x in config.std]
    revnormalize1 = TF.Normalize([0.0, 0.0, 0.0], revstd)
    revnormalize2 = TF.Normalize(revmean, [1.0, 1.0, 1.0])
    revNormalize = TF.Compose([revnormalize1, revnormalize2])
    revtrans = TF.Compose([revnormalize1, revnormalize2, TF.ToPILImage()])

    testset = datas.AniTripletWithSGMFlowTest(config.testset_root, config.test_flow_root,
                                              trans, config.test_size, config.test_crop_size,
                                              train=False)
    sampler = torch.utils.data.SequentialSampler(testset)
    validationloader = torch.utils.data.DataLoader(testset, sampler=sampler, batch_size=1,
                                                   shuffle=False, num_workers=1)
    to_img = TF.ToPILImage()

    print(testset)
    sys.stdout.flush()

    # prepare model
    model = getattr(models, config.model)(config.pwc_path).cuda()
    model = nn.DataParallel(model)
    retImg = []

    # load weights
    dict1 = torch.load(config.checkpoint)
    model.load_state_dict(dict1['model_state_dict'], strict=False)

    # prepare others
    store_path = config.store_path
    folders = []

    print('Everything prepared. Ready to test...')
    sys.stdout.flush()

    # start testing...
    with torch.no_grad():
        model.eval()
        ii = 0
        for validationIndex, validationData in enumerate(validationloader, 0):
            print('Testing {}/{}-th group...'.format(validationIndex, len(testset)))
            sys.stdout.flush()
            sample, flow, index, folder = validationData

            frame0 = None
            frame1 = sample[0]
            frame3 = None
            frame2 = sample[-1]

            folders.append(folder[0][0])

            # initial SGM flow
            F12i, F21i = flow
            F12i = F12i.float().cuda()
            F21i = F21i.float().cuda()

            ITs = [sample[tt] for tt in range(1, 2)]
            I1 = frame1.cuda()
            I2 = frame2.cuda()

            if not os.path.exists(config.store_path + '/' + folder[0][0]):
                os.mkdir(config.store_path + '/' + folder[0][0])

            # save the de-normalized input frames
            revtrans(I1.cpu()[0]).save(store_path + '/' + folder[0][0] + '/' + index[0][0] + '.jpg')
            revtrans(I2.cpu()[0]).save(store_path + '/' + folder[-1][0] + '/' + index[-1][0] + '.jpg')

            # interpolate config.inter_frames frames at evenly spaced time steps
            for tt in range(config.inter_frames):
                x = config.inter_frames
                t = 1.0 / (x + 1) * (tt + 1)

                outputs = model(I1, I2, F12i, F21i, t)
                It_warp = outputs[0]

                to_img(revNormalize(It_warp.cpu()[0]).clamp(0.0, 1.0)).save(
                    store_path + '/' + folder[1][0] + '/' + index[1][0] + '.png')

                save_flow_to_img(outputs[1].cpu(), store_path + '/' + folder[1][0] + '/' + index[1][0] + '_F12')
                save_flow_to_img(outputs[2].cpu(), store_path + '/' + folder[1][0] + '/' + index[1][0] + '_F21')


if __name__ == "__main__":
    # loading configures
    parser = argparse.ArgumentParser()
    parser.add_argument('config')
    args = parser.parse_args()

    config = Config.from_file(args.config)
    if not os.path.exists(config.store_path):
        os.mkdir(config.store_path)

    validate(config)
```
If you want to add frames to a complete video, I guess the code would need to be changed quite a lot (a rough sketch of splitting a video into triplet folders is below).
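As a rough, untested sketch of that idea (all paths are placeholders, and frame2.jpg is just a same-resolution placeholder as described in the steps above), a video could be split into consecutive-pair triplet folders with OpenCV:

```python
import os
import cv2

def video_to_triplet_folders(video_path, out_root):
    """Write every consecutive frame pair of `video_path` into its own clip folder."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    i = 0
    while ok:
        ok, cur = cap.read()
        if not ok:
            break
        clip = os.path.join(out_root, 'clip_{:05d}'.format(i))
        os.makedirs(clip, exist_ok=True)
        cv2.imwrite(os.path.join(clip, 'frame1.jpg'), prev)
        cv2.imwrite(os.path.join(clip, 'frame2.jpg'), prev)   # placeholder with the correct resolution
        cv2.imwrite(os.path.join(clip, 'frame3.jpg'), cur)
        prev = cur
        i += 1
    cap.release()

video_to_triplet_folders('my_video.mp4', 'datasets/my_video_triplets')
```

Each folder would still need matching flow files (or freshly generated flows), and the interpolated frames would have to be re-muxed into a video afterwards.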
Thanks! But please don't rename the "new frame" folders to match the pre-computed SGM flow folders. Otherwise, the wrong initial flow will definitely mislead the network...
The correct way is to generate new SGM flows for your own data using the code in models/sgm_model.
We will try to write a guide as soon as we can...
A script (with the specific hyper-parameters) to generate SGM flow for custom data would be greatly appreciated.
The test code I used does read the pre-computed SGM flow, but I replaced the optical flow in the pre_calc_sgm_flow folder with optical flow generated by RAFT, and this did not affect the final exported result.
I have a question that is semi-related to this issue; please tell me if I should start a new one.
If I want to run AnimeInterp on a custom cartoon, should I generate optical flow for it beforehand? Does it make sense to generate it ad hoc, or would that make inference much less efficient? Is there a way to run this without precomputed optical flow at all?
According to my tests, no matter what kind of optical flow is input, even empty or incorrect optical flow, the impact on the subsequent optical flow is not large, and the impact on the generated frames is even smaller.
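To reproduce this "empty flow" setting without touching the provided files, one simple (untested) approach is to mirror the pre-computed flow directory with zero arrays and point test_flow_root at the copy. Nothing about the loader's file naming is assumed here, because names and shapes are taken from the existing files; the destination folder name is a placeholder:

```python
import os
import glob
import numpy as np

src_root = 'datasets/test_2k_pre_calc_sgm_flows'   # provided SGM flows
dst_root = 'datasets/test_2k_zero_flows'           # point config.test_flow_root here (placeholder name)

for path in glob.glob(os.path.join(src_root, '**', '*.npy'), recursive=True):
    flow = np.load(path)
    out = os.path.join(dst_root, os.path.relpath(path, src_root))
    os.makedirs(os.path.dirname(out), exist_ok=True)
    np.save(out, np.zeros_like(flow))   # same shape and dtype, but all-zero flow
```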
@lisiyao21 I'm a little confused about this. Does it mean that RAFT (with end-to-end fine-tuning) is approximately equal to SGM + RFR (RAFT also contains recurrent refinement)? In my understanding, the SGM should be effective for the problems particular to animation interpolation.
@lhao0301 @ #13 I did some interesting tests in that issue; suggestions are welcome.
It's interesting, and I have just seen it. I only skimmed the method in the paper for a few minutes, so I may not understand it well yet; I need to read it carefully first.
Thanks for your comments! Would you please share more details on this experiment? Also, one thing I wish to ask: did you replace the .npy files in pre_calc_sgm_flows, or only the .jpg visualizations?
I replaced the .npy files.
I have been communicating extensively with @YiWeiHuang-stack; we use the same test method.
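For reference, a hedged sketch of what "replacing the .npy files" could look like with an externally computed flow (e.g. from RAFT). The new flow is assumed to be a (2, H, W) torch tensor; the on-disk layout is taken from the file being replaced rather than assumed:

```python
import numpy as np
import torch

def replace_flow_npy(npy_path, new_flow):
    """Overwrite a pre-computed flow .npy file with `new_flow`, a (2, H, W) torch tensor."""
    original = np.load(npy_path)                        # keep the stored layout and dtype
    flow = new_flow.detach().cpu().numpy()
    if original.ndim == 3 and original.shape[-1] == 2 and flow.shape[0] == 2:
        flow = flow.transpose(1, 2, 0)                  # (2, H, W) -> (H, W, 2) if stored channel-last
    assert flow.shape == original.shape, (flow.shape, original.shape)
    np.save(npy_path, flow.astype(original.dtype))
```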
Assuming you are on the right track, my guess is as follows.
First, note that the training contains three steps:
- Pre-train the RFR module as RAFT from scratch. The init_flow is initialized from zeros.
- Fix the RFR and train RFR + interpolation_net on realistic high-fps video data. The init_flow is again initialized from zeros.
- Add SGM and fine-tune RFR + interpolation_net on the proposed animation dataset.
Empty/zeros flow also works: because of steps 1 and 2, the method should also work when you input an empty/zeros flow (just as in RAFT, both zero-flow and warm-start flow initialization work). The RFR may be robust enough to correct/refine the init_flow.
Impact on the generated frame is smaller: I suggest testing some more extreme clips to check the improvement; common cases may not show its effect. As can be seen in the paper, the PSNR improvement grows from the easy to the hard cases, although it is not as large as that of the RFR module. I guess the RFR module behaves about the same as other CNN flow methods and hence plays an important role in the interpolation task, while the SGM module only works as a better warm initialization for some extreme cases.
Hope that is of some help to you. @98mxr
Hi. Thanks for your interest.
As to the effect of the SGM module, please refer to the ablation study in our paper. Generally, the SGM module improves the interpolated results (0.14 dB on average), especially for cases with large motion. The SSIM values are indeed similar, which is consistent with what is reported in the paper.
Also, it may not be appropriate to judge the quality of flows used in video interpolation from the appearance of their visualizations (e.g. the boundaries), since the flow network is tuned to better fit the interpolation task. On this point, see the very good paper "Video Enhancement with Task-Oriented Flow" by Xue et al.
Another point on whether the pre-computed SGM step can be ignored:
The implementation of SGM is based on color-piece segmentation and matching in "for loops" on the CPU. If the test frames contain many color pieces, the SGM module is slow, which is why we split SGM out as a pre-calculated step. For time efficiency when generating a long piece of video, one could optionally modify the model into a w/o-SGM version (a sketch follows below). But the results of the whole model should be reported in formal settings (e.g. in a paper).
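As a sketch of that w/o-SGM variant (my reading of the suggestion, not code from the repo): inside the per-triplet loop of the test script earlier in this thread, the pre-computed flows can simply be replaced with zero tensors before the model call, assuming the usual (batch, 2, H, W) layout:

```python
# inside the per-triplet loop of the test script above
F12i = torch.zeros_like(F12i)   # drop the SGM initialization
F21i = torch.zeros_like(F21i)
outputs = model(I1, I2, F12i, F21i, t)
```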
I got the code running with the provided dataset, but I would prefer to test with custom frames. Is there any way to achieve this with the current code, or would it need to be implemented?