RenYurui / PIRender

The source code of the ICCV2021 paper "PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering"

Motion Imitation custom images #5

Open molo32 opened 3 years ago

molo32 commented 3 years ago

How can I run motion imitation on custom images?

RenYurui commented 3 years ago

Hi, the cross-identity reenactment code will be provided in the next few days. You can then use it to animate your own images. Yurui

clumsynope commented 2 years ago

Hi @RenYurui, awesome project, congratulations. I have tested the cross-identity reenactment, but the result has an identity preservation problem, as shown in the following:

https://user-images.githubusercontent.com/67719349/136723007-4e4fa93e-a986-4778-8714-be8e2deb3458.mp4

Can you help me fix the problem?

loboere commented 2 years ago

Hi @clumsynope, can you tell me how to run the cross-identity reenactment? Are you using a custom image?

kjzju commented 2 years ago

> Hi @RenYurui, awesome project, congratulations. I have tested the cross-identity reenactment, but the result has an identity preservation problem, as shown in the following:
>
> out.mp4
>
> Can you help me fix the problem?

I have fixed the problem. Just replace the crop parameter with the source's crop parameter.

DWCTOD commented 2 years ago

@RenYurui Thank you for sharing your great work. I have tested the cross-identity reenactment, but the result is not very good:

https://user-images.githubusercontent.com/41478810/137077395-fc8f50e0-d2d4-44c0-ba33-0e8803de2a52.mp4

DWCTOD commented 2 years ago

> Hi @RenYurui, awesome project, congratulations. I have tested the cross-identity reenactment, but the result has an identity preservation problem, as shown in the following: out.mp4 Can you help me fix the problem?

> I have fixed the problem. Just replace the crop parameter with the source's crop parameter.

Hi, would you mind sharing details on how to replace the crop parameter with the source's crop parameter? Thanks.

kjzju commented 2 years ago

> Hi @RenYurui, awesome project, congratulations. I have tested the cross-identity reenactment, but the result has an identity preservation problem, as shown in the following: out.mp4 Can you help me fix the problem?

> I have fixed the problem. Just replace the crop parameter with the source's crop parameter.

> Hi, would you mind sharing details on how to replace the crop parameter with the source's crop parameter? Thanks.

In line 33 of `vox_video_dataset.py`, replace the last 3 dimensions of `semantics_numpy` with the source's, e.g. `semantics_numpy[-3:] = source_semantics_numpy[-3:]`. You can obtain `source_semantics_numpy` the same way you obtain `semantics_numpy`.
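In isolation, the edit looks like this (a minimal sketch only; variable names follow the description above, and the full modified `load_next_video` appears in a later comment):

    # Sketch: copy the source's crop parameters into every driving frame.
    source_semantics = self.transform_semantic(source_semantics_numpy, 0)
    target_semantics = self.transform_semantic(semantics_numpy, frame_index)
    # transform_semantic returns a (coeff_dim, seq_len) tensor, so [-3:]
    # selects the last 3 coefficient rows, i.e. the crop parameters.
    target_semantics[-3:] = source_semantics[-3:]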

https://user-images.githubusercontent.com/26479528/137242017-e1a5dcd6-bda1-446f-b0b7-e3f40d72e35a.mp4

DWCTOD commented 2 years ago

> Hi @RenYurui, awesome project, congratulations. I have tested the cross-identity reenactment, but the result has an identity preservation problem, as shown in the following: out.mp4 Can you help me fix the problem?

> I have fixed the problem. Just replace the crop parameter with the source's crop parameter.

> Hi, would you mind sharing details on how to replace the crop parameter with the source's crop parameter? Thanks.

> In line 33 of `vox_video_dataset.py`, replace the last 3 dimensions of `semantics_numpy` with the source's, e.g. `semantics_numpy[-3:] = source_semantics_numpy[-3:]`. You can obtain `source_semantics_numpy` the same way you obtain `semantics_numpy`.

id10283.vaK4t1-WD4M.031553.031737.mp4

Thanks

DWCTOD commented 2 years ago

@kjzju

Hi, I modified and tested the code following your hint, but there still seem to be some problems: during cross-identity reenactment the appearance features get transferred as well, so the identity information does not seem to be carried over. Below is the code from `vox_dataset.py`; could you share exactly how you modified it?

    def transform_semantic(self, semantic, frame_index):
        index = self.obtain_seq_index(frame_index, semantic.shape[0])
        coeff_3dmm = semantic[index, ...]
        id_coeff = coeff_3dmm[:, :80]          # identity
        ex_coeff = coeff_3dmm[:, 80:144]       # expression
        # tex_coeff = coeff_3dmm[:, 144:224]   # texture
        angles = coeff_3dmm[:, 224:227]        # euler angles for pose
        # gamma = coeff_3dmm[:, 227:254]       # lighting
        translation = coeff_3dmm[:, 254:257]   # translation
        crop = coeff_3dmm[:, 257:300]          # crop parameters
        coeff_3dmm = np.concatenate([ex_coeff, angles, translation, crop], 1)
        return torch.Tensor(coeff_3dmm).permute(1, 0)

kjzju commented 2 years ago

I just modified the code in `vox_video_dataset.py` like this:

def load_next_video(self):
    data={}
    self.video_index += 1
    video_item = self.video_items[self.video_index]

    video_item_src = self.video_items[random.randint(0, len(self.video_items) - 1)] # randomly select another source video (randint is inclusive at both ends)

    with self.env.begin(write=False) as txn:
        # key = format_for_lmdb(video_item['video_name'], 0)
        # img_bytes_1 = txn.get(key)
        # img1 = Image.open(BytesIO(img_bytes_1))
        # data['source_image'] = self.transform(img1)

        # cross-identity
        key = format_for_lmdb(video_item_src['video_name'], 0)
        img_bytes_1 = txn.get(key)
        img1 = Image.open(BytesIO(img_bytes_1))
        data['source_image'] = self.transform(img1) # source image

        semantics_key = format_for_lmdb(video_item_src['video_name'], 'coeff_3dmm')
        semantics_numpy = np.frombuffer(txn.get(semantics_key), dtype=np.float32)
        semantics_numpy = semantics_numpy.reshape((video_item_src['num_frame'], -1))
        source_semantics = self.transform_semantic(semantics_numpy, 0) # source semantic numpy

        semantics_key = format_for_lmdb(video_item['video_name'], 'coeff_3dmm')
        semantics_numpy = np.frombuffer(txn.get(semantics_key), dtype=np.float32)
        semantics_numpy = semantics_numpy.reshape((video_item['num_frame'],-1)) # target semantic numpy

        data['target_image'], data['target_semantics'] = [], []
        for frame_index in range(video_item['num_frame']):
            key = format_for_lmdb(video_item['video_name'], frame_index)
            img_bytes_1 = txn.get(key)
            img1 = Image.open(BytesIO(img_bytes_1))
            data['target_image'].append(self.transform(img1))
            target_semantics = self.transform_semantic(semantics_numpy, frame_index)

            target_semantics[-3:] = source_semantics[-3:] # replace the crop parameters of the target's with the source's

            data['target_semantics'].append(
                # self.transform_semantic(semantics_numpy, frame_index)
                target_semantics
            )
        data['video_name'] = video_item['video_name']
    return data

DWCTOD commented 2 years ago

@kjzju Thanks, I will try again !

josh-zhu commented 2 years ago

@DWCTOD Have you figured out the cross-identity motion imitation issue? I agree with you that the identity information of the source image is not passed to the generator model, and the result does not preserve the source identity very well. @RenYurui, have you tried passing the source identity coefficients to the generator as an extra input? Hoping for your reply, thanks.
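For concreteness, one hypothetical way to try that would be to keep `id_coeff` in the vector returned by `transform_semantic` (a sketch only; the released checkpoint was trained without the identity coefficients, so the generator's input layer would have to be resized and the model retrained):

    def transform_semantic(self, semantic, frame_index):
        index = self.obtain_seq_index(frame_index, semantic.shape[0])
        coeff_3dmm = semantic[index, ...]
        id_coeff = coeff_3dmm[:, :80]          # identity
        ex_coeff = coeff_3dmm[:, 80:144]       # expression
        angles = coeff_3dmm[:, 224:227]        # euler angles for pose
        translation = coeff_3dmm[:, 254:257]   # translation
        crop = coeff_3dmm[:, 257:300]          # crop parameters
        # Hypothetical change: prepend the 80 identity coefficients.
        coeff_3dmm = np.concatenate([id_coeff, ex_coeff, angles, translation, crop], 1)
        return torch.Tensor(coeff_3dmm).permute(1, 0)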

DWCTOD commented 2 years ago

> @DWCTOD Have you figured out the cross-identity motion imitation issue? I agree with you that the identity information of the source image is not passed to the generator model, and the result does not preserve the source identity very well. @RenYurui, have you tried passing the source identity coefficients to the generator as an extra input? Hoping for your reply, thanks.

Hi, this hasn't been solved yet. My guess is that this part of the work hasn't been released: simply passing the 3DMM coefficients through probably cannot reproduce the author's demo. A facial motion retargeting module is likely needed to transfer cross-identity motion naturally and effectively.
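One common baseline for such retargeting, e.g. the relative mode used by first-order-motion-style methods, is to transfer only each driving frame's offset from the driving video's first frame instead of copying coefficients directly. A rough sketch (function and variable names are hypothetical, and this is not the method behind the author's demo):

    import numpy as np

    def relative_retarget(source_coeff, drive_coeff, drive_initial_coeff):
        """Relative 3DMM motion transfer (rough sketch).

        source_coeff:        coefficients of the source image
        drive_coeff:         coefficients of the current driving frame
        drive_initial_coeff: coefficients of the first driving frame
        """
        # Apply the driving video's per-frame change on top of the source's
        # own coefficients, preserving the source's baseline shape and pose.
        out = source_coeff + (drive_coeff - drive_initial_coeff)
        out[-3:] = source_coeff[-3:]  # keep the source's crop parameters
        return out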

loboere commented 2 years ago

How can I use custom images instead of `video_item['video_name']`?