the pretrained checkpoint is 256, can you release a 512x512 pretrained version. thank you.

ali-vilab / dreamtalk

Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models

https://dreamtalk-project.github.io/

MIT License


the pretrained checkpoint is 256, can you release a 512x512 pretrained version. thank you. #6

Open zhangziliang04 opened 10 months ago

zhangziliang04 commented 10 months ago

The pretrained checkpoint is 256x256. Could you release a 512x512 pretrained version? Thank you.

YifengMa9 commented 10 months ago

Thanks for your attention. We didn't anticipate such a high demand for 512 resolution 😂. We are considering adding support for 512 resolution in the near future. We plan to improve the resolution by using common super-resolution models, such as the temporal GFPGAN proposed in MetaPortrait, in post-processing.
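For readers wondering what "in post-processing" could look like, here is a minimal sketch, not the DreamTalk implementation. It assumes the model's 256x256 output frames are passed one by one through some enhancer. The names `nearest_neighbor_2x` and `upscale_frames` are hypothetical, and the nearest-neighbor upscaler is only a stand-in for a real super-resolution or face-restoration model.

```python
from typing import Callable, List

# A grayscale frame as rows of pixel values (illustration only).
Frame = List[List[int]]


def nearest_neighbor_2x(frame: Frame) -> Frame:
    """Placeholder enhancer: duplicate every pixel into a 2x2 block.

    A real pipeline would call a super-resolution model here instead.
    """
    out: Frame = []
    for row in frame:
        wide = [px for px in row for _ in range(2)]  # double the width
        out.append(wide)
        out.append(list(wide))  # double the height
    return out


def upscale_frames(
    frames: List[Frame],
    enhance: Callable[[Frame], Frame] = nearest_neighbor_2x,
) -> List[Frame]:
    """Apply the enhancer independently to each generated frame."""
    return [enhance(frame) for frame in frames]


# Two dummy 256x256 frames standing in for the generator's output.
frames_256 = [[[t] * 256 for _ in range(256)] for t in range(2)]
frames_512 = upscale_frames(frames_256)
print(len(frames_512), len(frames_512[0]), len(frames_512[0][0]))  # prints: 2 512 512
```

Because each frame is processed independently here, nothing in this sketch enforces consistency between neighboring frames. That gap is presumably what a temporal model is meant to address.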

Inferencer commented 10 months ago

None of them are temporal, tbh. CodeFormer with a high weight is pretty good, as it makes the skin less cartoonish compared to GFPGAN, but it comes down to speed too, and I haven't compared them on that.

Edit: oh, I see what you are referring to. That's very interesting.