neeek2303 / EMOPortraits

Official implementation of EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars

Add video-driven pipeline script for portrait animation #22

Open · liutaocode opened 2 weeks ago

liutaocode commented 2 weeks ago

Description

johndpope commented 2 weeks ago

On new versions of PIL it's complaining:

```
Traceback (most recent call last):
  File "/media/oem/12TB/EMOPortraits/pipeline.py", line 220, in <module>
    source_img = to_512(Image.open(args.source_image_path))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/oem/12TB/EMOPortraits/pipeline.py", line 23, in <lambda>
    to_512 = lambda x: x.resize((512, 512), Image.ANTIALIAS)
                                            ^^^^^^^^^^^^^^^
AttributeError: module 'PIL.Image' has no attribute 'ANTIALIAS'
```

Find-and-replace `ANTIALIAS` -> `LANCZOS` fixes things.

`pip install pillow -U` to recreate the error (`Image.ANTIALIAS` was deprecated in Pillow 9.1 and removed in 10.0).
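If you'd rather not patch the source, a version-agnostic shim also works (a sketch, not part of the repo; `Image.Resampling` exists from Pillow 9.1 onward):

```python
from PIL import Image

# Pillow >= 9.1 exposes resampling filters under Image.Resampling;
# Image.ANTIALIAS itself is gone in Pillow 10.
try:
    LANCZOS = Image.Resampling.LANCZOS
except AttributeError:  # older Pillow
    LANCZOS = Image.LANCZOS

to_512 = lambda x: x.resize((512, 512), LANCZOS)
```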

The performance of this model seems very good. Did you run an FPS benchmark?

liutaocode commented 2 weeks ago


@johndpope

Hello, I didn't encounter this error; I directly used the Conda environment from the README provided by the author. Also, I haven't run an FPS benchmark, but from my experience running the model, the speed isn't particularly fast. Without optimizations, I don't think it can reach real-time performance; I estimate it runs at just a few FPS on my RTX 3090.

johndpope commented 2 weeks ago

fyi - there's a MODNet instance already inside the inference wrapper.

It may be that the inference forward pass is already computing the mask (duplicating the ibug usage): `source_mask_modnet = self.get_mask(source_img_crop)`

It would be a step backwards if this is that crippled in performance compared to MegaPortraits.

UPDATE: I see the infer pass ignores this. From my testing I'm seeing 9-10 FPS.
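For anyone reproducing that number, this is roughly how I'd measure it (a sketch assuming a CUDA device; `run_inference` is a hypothetical stand-in for one frame through the repo's pipeline):

```python
import time
import torch

def benchmark_fps(run_inference, n_warmup=10, n_iters=100):
    """Return frames per second for repeated single-frame inference calls."""
    for _ in range(n_warmup):        # warm-up: CUDA context and kernel caches
        run_inference()
    torch.cuda.synchronize()         # drain queued GPU work before timing
    start = time.perf_counter()
    for _ in range(n_iters):
        run_inference()
    torch.cuda.synchronize()         # wait for the last frame to finish
    return n_iters / (time.perf_counter() - start)
```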

fyi - I untangled the args plumbing in the va model:

args -> args -> args -> args -> args

https://gist.github.com/johndpope/3dda5ff978541564d78b7ef26a4c3661


```python
from omegaconf import OmegaConf
import torch.nn as nn


class Model(nn.Module):

    # DELETETOOMANYARGUMENTS stands in for the long positional-argument
    # list that the YAML config below now replaces
    def __init__(self, DELETETOOMANYARGUMENTS, training=True, rank=0, exp_dir=None):
        super(Model, self).__init__()

        self.exp_dir = exp_dir

        # load every hyperparameter from the config file instead of
        # threading it through nested args
        self.cfg = OmegaConf.load('./models/stage_1/volumetric_avatar/va.yaml')
        self.args = self.cfg
        args = self.cfg
        self.va_config = VolumetricAvatarConfig(args)
        self.weights = self.va_config.get_weights()
```
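With everything in the YAML, per-run overrides can be merged on top of the defaults instead of being passed as constructor arguments (a sketch of the standard OmegaConf pattern; the `image_size` key is just an illustrative example, not necessarily in the repo's config):

```python
from omegaconf import OmegaConf

# defaults come from the repo's YAML config
cfg = OmegaConf.load('./models/stage_1/volumetric_avatar/va.yaml')

# key=value pairs from the command line win over the defaults,
# e.g.  python train.py image_size=256
cfg = OmegaConf.merge(cfg, OmegaConf.from_cli())
```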

I created an issue about the lack of clarity in the forward pass: https://github.com/neeek2303/EMOPortraits/issues/27. I guess it's a side effect of not including the audio model, which probably glued everything together somehow.