Open liutaocode opened 2 weeks ago
On new versions of PIL it's complaining:

```
Traceback (most recent call last):
  File "/media/oem/12TB/EMOPortraits/pipeline.py", line 220, in
    source_img = to_512(Image.open(args.source_image_path))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/oem/12TB/EMOPortraits/pipeline.py", line 23, in
    to_512 = lambda x: x.resize((512, 512), Image.ANTIALIAS)
                                            ^^^^^^^^^^^^^^^
AttributeError: module 'PIL.Image' has no attribute 'ANTIALIAS'
```
A find-and-replace of `ANTIALIAS` -> `LANCZOS` fixes things (Pillow 10 removed the deprecated `Image.ANTIALIAS` alias).
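As a sketch, a version-tolerant variant of the `to_512` helper from `pipeline.py` (assuming only the resampling constant needs to change) could look like:

```python
from PIL import Image

# Pillow 10 removed the deprecated Image.ANTIALIAS alias; the same
# filter is available as Image.Resampling.LANCZOS (Pillow >= 9.1)
# or as Image.LANCZOS on older releases.
LANCZOS = getattr(Image, "Resampling", Image).LANCZOS

# Version-tolerant equivalent of the to_512 lambda in pipeline.py.
to_512 = lambda x: x.resize((512, 512), LANCZOS)
```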
Run `pip install pillow -U` to reproduce the error.
The performance of this model seems very good. Did you run an FPS benchmark?
@johndpope
Hello, I didn't encounter this error; I directly used the Conda environment from the README provided by the author. Also, I haven't run an FPS benchmark, but from my experience running the model, the speed isn't particularly fast. Without optimizations, I doubt it can achieve real-time performance directly; I estimate it runs at just a few FPS on my RTX 3090.
FYI - there's a MODNet instance already inside the inference wrapper. It may be that the inference forward pass is already computing the mask (duplicating the ibug usage): `source_mask_modnet = self.get_mask(source_img_crop)`
It would be a step backwards if this model is so crippled in performance compared to MegaPortraits.
UPDATE: I see the inference pass ignores this. From my testing I'm seeing 9-10 FPS.
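For reference, a minimal way to get an FPS number like the one above (the `step` callable, standing in for one per-frame inference pass, is hypothetical):

```python
import time

def benchmark_fps(step, n_frames=100):
    # Call `step` (one per-frame inference pass) n_frames times and
    # return the average frames per second over the wall-clock interval.
    start = time.perf_counter()
    for _ in range(n_frames):
        step()
    elapsed = time.perf_counter() - start
    return n_frames / elapsed
```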
FYI - I untangled the args in the VA model (args -> args -> args -> args -> args):
https://gist.github.com/johndpope/3dda5ff978541564d78b7ef26a4c3661
```python
import torch.nn as nn
from omegaconf import OmegaConf


class Model(nn.Module):
    def __init__(self, DELETETOOMANYARGUMENTS, training=True, rank=0, exp_dir=None):
        super(Model, self).__init__()
        self.exp_dir = exp_dir
        # Load the stage-1 config directly instead of threading args
        # through several constructors.
        self.cfg = OmegaConf.load('./models/stage_1/volumetric_avatar/va.yaml')
        self.args = self.cfg
        args = self.cfg
        self.va_config = VolumetricAvatarConfig(args)
        self.weights = self.va_config.get_weights()
```
I created an issue about the lack of clarity in the forward pass: https://github.com/neeek2303/EMOPortraits/issues/27. I guess it's a side effect of not including the audio model, which probably glued everything together somehow.
Description

This PR adds a new script, run_video_driven_pipeline.py, that provides a command-line interface for animating portrait images using driving videos.

This PR adds Baidu Cloud Storage download links for model weights and dependencies, making it more accessible for users in China, where downloading from international sources might be slow or unstable.