Open clumsynope opened 3 years ago
@clumsynope I have a doubt: I am not sure how to create multiple target images and poses from a given source image.
I am following the examples/inference.ipynb
but all I can get is a single image. How do I generate multiple target images/poses?
Hi @clumsynope, thanks for this inspiring, awesome project! I'm also curious about the same question as @prraoo. I'd appreciate it if you could share your solution or a code example for processing multiple target images.
I really appreciate your creative work and your sharing of this code. However, I found video jitter during my video test. My result below consists of source_img, target_img, target_pose, and pred_img. I replaced the face detector with a more stable one, but the problem remains. In https://github.com/saic-violet/bilayer-model/blob/master/infer.py, I found that the image and pose are center-aligned per frame, which I suspect causes the video jitter. Can we remove the center-alignment from the data preprocessing and fine-tune the model again? Looking forward to your reply. Thank you very much. @egorzakharov
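Not the authors' fix, but a common workaround for per-frame alignment jitter is to smooth the detector's crop boxes over time before cropping, e.g. with an exponential moving average. A minimal sketch; `smooth_boxes` and the `[x0, y0, x1, y1]` box layout are my own assumptions, not part of infer.py:

```python
import numpy as np

def smooth_boxes(boxes, alpha=0.3):
    """Exponential moving average over per-frame crop boxes.

    boxes: (T, 4) array of [x0, y0, x1, y1], one box per frame.
    alpha: smoothing factor in (0, 1]; lower = smoother but more lag.
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    smoothed = boxes.copy()
    for t in range(1, len(boxes)):
        # Blend the current detection with the smoothed history.
        smoothed[t] = alpha * boxes[t] + (1 - alpha) * smoothed[t - 1]
    return smoothed
```

Cropping each frame with the smoothed box (instead of the raw per-frame detection) usually reduces jitter without retraining, at the cost of a slight lag when the head moves quickly.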
@chejinsuk To solve my issue, I wrote the following inference script:
```python
import glob
import os

import numpy as np
from natsort import natsorted
from PIL import Image

from infer import InferenceWrapper

# Load the model
args_dict = {
    'project_dir': '../',
    'init_experiment_dir': '../runs/vc2-hq_adrianb_paper_main',
    'init_networks': 'identity_embedder, texture_generator, keypoints_embedder, inference_generator',
    'init_which_epoch': '2225',
    'num_gpus': 1,
    'experiment_name': 'vc2-hq_adrianb_paper_enhancer',
    'which_epoch': '1225',
    'spn_networks': 'identity_embedder, texture_generator, keypoints_embedder, inference_generator, texture_enhancer',
    'enh_apply_masks': False,
    'inf_apply_masks': False,
}
module = InferenceWrapper(args_dict)

# Your input frames location here:
target_paths = natsorted(glob.glob('images/target_video/*.png'))
target_image_list = [np.asarray(Image.open(path)) for path in target_paths]

input_data_dict = {
    'source_imgs': np.asarray(Image.open('images/source.jpg')),  # H x W x 3
    'target_imgs': np.array(target_image_list),                  # B x H x W x 3
}
output_data_dict = module(input_data_dict)
print(output_data_dict['pred_enh_target_imgs'].shape)

def to_image(img_tensor, seg_tensor=None):
    # Map the tensor from [-1, 1] to [0, 255] and reorder to H x W x C.
    img_array = ((img_tensor.clamp(-1, 1).cpu().numpy() + 1) / 2).transpose(1, 2, 0) * 255
    if seg_tensor is not None:
        # Composite the predicted foreground onto a white background.
        seg_array = seg_tensor.cpu().numpy().transpose(1, 2, 0)
        img_array = img_array * seg_array + 255.0 * (1 - seg_array)
    return Image.fromarray(img_array.astype('uint8'))

# Save location
os.makedirs('results', exist_ok=True)
for i in range(len(target_image_list)):
    pred_img = to_image(output_data_dict['pred_enh_target_imgs'][0, i],
                        output_data_dict['pred_target_segs'][0, i])
    pred_img.save('results/{}.png'.format(i))
```
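As a follow-up, the saved frames can be stitched into an animation for a quick visual jitter check. A small sketch using Pillow; the `frames_to_gif` helper and the `preview.gif` output name are my own, and it assumes the numeric `0.png, 1.png, ...` filenames produced by the script above:

```python
import glob
import os

from PIL import Image

def frames_to_gif(frame_dir, out_path, duration_ms=40):
    """Stitch numbered PNG frames (0.png, 1.png, ...) into a looping GIF."""
    paths = sorted(glob.glob(os.path.join(frame_dir, '*.png')),
                   key=lambda p: int(os.path.splitext(os.path.basename(p))[0]))
    frames = [Image.open(p) for p in paths]
    if not frames:
        raise ValueError('no frames found in ' + frame_dir)
    # save_all + append_images writes every frame into one animated GIF.
    frames[0].save(out_path, save_all=True, append_images=frames[1:],
                   duration=duration_ms, loop=0)
```

Usage: `frames_to_gif('results', 'results/preview.gif')` after the loop above has written the predictions.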