KwaiVGI / LivePortrait

Bring portraits to life!
https://liveportrait.github.io

High-quality zero-shot lipsync pipeline built on LivePortrait #400

Open mvoodarla opened 1 month ago

mvoodarla commented 1 month ago

Hey folks! My team has been exploring zero-shot lipsyncing for a while, and we think we've improved noticeably on MuseTalk's quality by using LivePortrait to neutralize the expression and CodeFormer to enhance the output. Here's an example.
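
For those curious how the pieces fit together before the repo is out, here's a minimal sketch of the three-stage flow; the helper functions below are placeholders for each project's own inference code, not our actual implementation:

```python
# Illustrative sketch only: the three helpers stand in for the
# LivePortrait, MuseTalk, and CodeFormer inference entry points,
# which each project ships separately.
from pathlib import Path


def neutralize_expression(video: Path) -> Path:
    """Placeholder: use LivePortrait retargeting to drive the source face
    toward a neutral expression (mostly closed mouth) before lipsyncing."""
    raise NotImplementedError("run LivePortrait inference here")


def generate_lipsync(video: Path, audio: Path) -> Path:
    """Placeholder: run MuseTalk on the neutralized video and target audio."""
    raise NotImplementedError("run MuseTalk inference here")


def enhance_faces(video: Path) -> Path:
    """Placeholder: restore detail in the regenerated mouth region with CodeFormer."""
    raise NotImplementedError("run CodeFormer inference here")


def lipsync(source_video: Path, audio: Path) -> Path:
    # Neutralize -> lipsync -> enhance, as described above.
    neutralized = neutralize_expression(source_video)
    synced = generate_lipsync(neutralized, audio)
    return enhance_faces(synced)
```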

https://github.com/user-attachments/assets/cfabcd9f-92e0-4c52-b786-77fc63eef81b

We wrote a technical blog on it: https://www.sievedata.com/blog/sievesync-zero-shot-lipsync-api-developers

Hope to put out an OSS repo soon too :)

Anything we don't talk about in the blog that we should in our repo release?

ziyaad30 commented 1 month ago

No CodeFormer, no Stable Diffusion, just Audio2Head and LivePortrait, so you want to attach a price to this open-source software now?

This actually took me 6.5 minutes

https://github.com/user-attachments/assets/b702c7da-1e1f-412e-989d-0202032c45ac

ziyaad30 commented 1 month ago

Just another example of FREE

https://github.com/user-attachments/assets/68c3f1f4-dad8-4c71-8d25-0d4f091361b1

cleardusk commented 1 month ago

It seems like a good practical mix of MuseTalk and LivePortrait 👍 @mvoodarla Will it be open-sourced later?

mvoodarla commented 1 month ago

Hey @ziyaad30, those generations look nice! While we plan to open source relevant parts of our code, the full system is tailored to our infrastructure and wouldn't be directly usable by most developers. Our blog details the steps to achieve this quality for those interested in replicating it.

We charge for the service to cover the significant GPU costs for inference with large Stable Diffusion models. Our pay-per-use model is more accessible than the upfront cost of purchasing hardware. We're committed to open sourcing more as we develop with open source models, but some costs will always remain due to GPU requirements.

We plan to release an OSS repo soon (see the bottom of the blog for details!).

ziyaad30 commented 1 month ago

If you can do that and release it, so that those who have the GPU/power to run it but don't have access to pay can use it, then it'll be very good.

mvoodarla commented 1 month ago

Here it is! https://github.com/sieve-community/sievesync

Fantadana commented 1 month ago

@mvoodarla Thanks! Your model's performance is quite good. It seems that your model's main framework is still MuseTalk, so I'm curious how much impact the retargeting module has on the results. Could you provide some examples with and without retargeting to illustrate the difference?