OpenTalker / video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
https://opentalker.github.io/video-retalking/
Apache License 2.0

Some idea discussions on saving time in operations. #159

Open Miss0x opened 8 months ago

Miss0x commented 8 months ago

Suppose the goal is to match a fixed video with different audio tracks, so that only the character's facial expressions and lip movements change.

Could we split the pipeline into separate stages? For example: pre-generate the frame collection for the video, then preprocess and save the face landmarks for each frame. If only the audio changes, the whole process would then reduce to the final stages: lip synchronization, face enhancement, and recombining the frames into a video.

If such a scheme is feasible, I expect each job in this specific scenario could save at least 25% of the processing time.

These are just my personal ideas; I don't know whether they are feasible, and I would welcome input from this project's contributors and users.

Thanks a lot.

kunncheng commented 8 months ago

Thanks for your suggestion. This feature has been implemented since the release of the project.

The same input video is only re-preprocessed if you specify the --re_preprocess argument.
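In other words, the caching described in the original question is the default behavior. A minimal usage sketch, assuming the `inference.py` entry point and the example paths used in the project's README (the exact paths are illustrative):

```shell
# First run: preprocessing (frame extraction, face landmarks) is computed
# and cached alongside the input video.
python3 inference.py \
  --face examples/face/1.mp4 \
  --audio examples/audio/1.wav \
  --outfile results/1_1.mp4

# Second run with the SAME video but a different audio track: the cached
# preprocessing results are reused, so only lip sync, enhancement, and
# frame recombination are performed.
python3 inference.py \
  --face examples/face/1.mp4 \
  --audio examples/audio/2.wav \
  --outfile results/1_2.mp4

# To discard the cache and preprocess from scratch, add --re_preprocess:
python3 inference.py \
  --face examples/face/1.mp4 \
  --audio examples/audio/2.wav \
  --outfile results/1_2.mp4 \
  --re_preprocess
```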

AIhasArrived commented 8 months ago

> Thanks for your suggestion. This feature has been implemented since the release of the project.
>
> The same input video will only be repreprocessed if you specify the --re_preprocess argument.

Hello @kunncheng, I am happy to see you are active here (4 days ago); it gives me hope that you will see my issue about RAM usage vs. GPU usage (the GPU does not seem to be used at all). Can't wait for your answer and/or solution.