How long before we can get to this level

Zejun-Yang / AniPortrait

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

Apache License 2.0

How long before we can get to this level #126

Open gibsonhu123 opened 6 months ago

gibsonhu123 commented 6 months ago

https://www.microsoft.com/en-us/research/project/vasa-1/

MisterT96 commented 6 months ago

Just wait until they release new weights and then compare again.

gessyoo commented 6 months ago

I've made some videos that look better than those VASA-1 examples, but I'm also looking forward to the release of the pre-trained audio model, especially since M$ won't release VASA-1 for some time, if ever.

MisterT96 commented 6 months ago

An explanation of how, plus a proof video, would be nice. We are really interested in your progress.

gessyoo commented 6 months ago

I will post a few examples on social media (https://www.instagram.com/p/C6Krt38ydtG/), and you are welcome to include them here as examples, as long as you attribute the source. The "how," at least until the audio model is released, is a matter of choosing an appropriate video and photo. The illusion breaks down if the head movement is too rapid or the angle is too extreme.

gibsonhu123 commented 6 months ago

I will post a few examples on social media (https://www.instagram.com/reel/C6CdHDTOtFW/), and you are welcome to include them here as examples, as long as you attribute the source. The "how," at least until the audio model is released, is a matter of choosing an appropriate video and photo. The illusion breaks down if the head movement is too rapid or the angle is too extreme.

Could you provide a better video? The Instagram reel is partially cut off.

MisterT96 commented 6 months ago

This is quite impressive, thank you very much!

gessyoo commented 6 months ago

I will post a few examples on social media (https://www.instagram.com/p/C5UbGIuPh7G), and you are welcome to include them here as examples, as long as you attribute the source. The "how," at least until the audio model is released, is a matter of choosing an appropriate video and photo. The illusion breaks down if the head movement is too rapid or the angle is too extreme.

Here's another, feel free to use it as an example if you want: https://www.youtube.com/shorts/lgnfAuh5wBY