saic-violet / bilayer-model

Mozilla Public License 2.0

video jitter #14

Open clumsynope opened 3 years ago

clumsynope commented 3 years ago

I really appreciate your creative work, and thank you for sharing this code. However, I found video jitter during my video test. Below is my result, which consists of source_img, target_img, target_pose, and pred_img: [result image]. I've replaced the face detector with a more stable one, but the problem remains. In https://github.com/saic-violet/bilayer-model/blob/master/infer.py, I found that the image and pose are center-aligned as shown below, which I think may cause the jitter: [screenshot of the alignment code in infer.py]. Can we remove the center-alignment operation from the data processing and finetune the model again? Looking forward to your reply. Thank you very much. @egorzakharov
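As a possible workaround, instead of removing the alignment, one could smooth the detected keypoints over time before they drive the center-alignment, so the crop changes slowly between frames. A rough, untested sketch (smooth_keypoints and the (68, 2) landmark shape are my own assumptions, not from infer.py):

import numpy as np

def smooth_keypoints(keypoints_per_frame, alpha=0.7):
    # Exponential moving average over per-frame landmarks.
    # keypoints_per_frame: list of (68, 2) arrays, one per video frame.
    # alpha: smoothing factor; closer to 1 means smoother but more lag.
    smoothed = []
    ema = None
    for pts in keypoints_per_frame:
        pts = np.asarray(pts, dtype=np.float32)
        ema = pts if ema is None else alpha * ema + (1 - alpha) * pts
        smoothed.append(ema.copy())
    return smoothed

If the jitter really comes from the per-frame alignment, smoothing the landmarks that drive it should stabilize the crop without retraining the model.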

prraoo commented 3 years ago

@clumsynope I have a doubt: I am not sure how to create multiple target images and poses for a given source image.

I am following examples/inference.ipynb, but all I can get is a single image. How do I generate multiple target images/poses?

chejinsuk commented 3 years ago

Hi @clumsynope, thanks for this awesome and inspiring work! I'm curious about the same question as @prraoo. I'd appreciate it if you could share your solution or a code example for processing multiple target images.


prraoo commented 3 years ago

@chejinsuk To solve my issue, I wrote an inference script as follows:

import glob
import os
import sys

import numpy as np
from PIL import Image
from natsort import natsorted

sys.path.append('../')  # repo root, so that infer.py is importable
from infer import InferenceWrapper

## Load the model

args_dict = {
    'project_dir': '../',
    'init_experiment_dir': '../runs/vc2-hq_adrianb_paper_main',
    'init_networks': 'identity_embedder, texture_generator, keypoints_embedder, inference_generator',
    'init_which_epoch': '2225',
    'num_gpus': 1,
    'experiment_name': 'vc2-hq_adrianb_paper_enhancer',
    'which_epoch': '1225',
    'spn_networks': 'identity_embedder, texture_generator, keypoints_embedder, inference_generator, texture_enhancer',
    'enh_apply_masks': False,
    'inf_apply_masks': False}

module = InferenceWrapper(args_dict)

## Your input frames location here (all frames must share the same resolution):
target_folder = natsorted(glob.glob('images/target_video/*.png'))
target_image_list = []

for img in target_folder:
    target_image_list.append(np.asarray(Image.open(img)))

input_data_dict = {
    'source_imgs': np.asarray(Image.open('images/source.jpg')),  # H x W x 3
    'target_imgs': np.array(target_image_list)}                  # B x H x W x 3

output_data_dict = module(input_data_dict)
print(output_data_dict['pred_enh_target_imgs'].shape)

def to_image(img_tensor, seg_tensor=None):
    # Map the network output from [-1, 1] to [0, 255] and to H x W x 3 layout
    img_array = ((img_tensor.clamp(-1, 1).cpu().numpy() + 1) / 2).transpose(1, 2, 0) * 255

    if seg_tensor is not None:
        # Composite the predicted foreground onto a white background
        seg_array = seg_tensor.cpu().numpy().transpose(1, 2, 0)
        img_array = img_array * seg_array + 255. * (1 - seg_array)

    return Image.fromarray(img_array.astype('uint8'))

# save location
os.makedirs('results', exist_ok=True)

for i in range(len(target_image_list)):
    pred_img = to_image(output_data_dict['pred_enh_target_imgs'][0, i],
                        output_data_dict['pred_target_segs'][0, i])
    pred_img.save('results/{}.png'.format(i))
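In case it helps with the question above: the frames in images/target_video/ come from a video that I split beforehand. A minimal sketch using OpenCV (the video path and frame naming are just examples, not from the repo):

import os
import cv2  # pip install opencv-python

os.makedirs('images/target_video', exist_ok=True)
cap = cv2.VideoCapture('images/target.mp4')
idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # cap.read() returns BGR frames, which is what cv2.imwrite expects
    cv2.imwrite('images/target_video/{:05d}.png'.format(idx), frame)
    idx += 1
cap.release()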
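To inspect the jitter, the saved predictions can also be stitched back into a video; a rough sketch with cv2.VideoWriter (the mp4v codec and 25 fps are assumptions, adjust them to your source video):

import glob
import cv2
from natsort import natsorted

frames = natsorted(glob.glob('results/*.png'))
h, w = cv2.imread(frames[0]).shape[:2]
writer = cv2.VideoWriter('results/pred.mp4', cv2.VideoWriter_fourcc(*'mp4v'), 25, (w, h))
for path in frames:
    writer.write(cv2.imread(path))  # imread returns BGR, as VideoWriter expects
writer.release()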