emilianavt / VSeeFaceSDK

The VSeeFace SDK for Unity is used to export models in the VSFAvatar format.

Using only lip sync visemes, without hybrid blending with iPhone tracking, while speaking #19

Open katrinshtorm opened 2 years ago

katrinshtorm commented 2 years ago

I have lip sync visemes (AEIOU) and I want to use only them for the mouth while iPhone tracking is enabled, without the hybrid blending with iPhone tracking. I've tried different combinations of settings but still couldn't set it up. If I disable "Receive facial features", the mouth uses the lip sync visemes, but the upper part of the face stops working. When "Only open mouth according to one source" is enabled, blending still happens between the lip sync visemes and the iPhone tracking lip sync. Is there a way to use only the AEIOU lip sync visemes while still receiving the upper facial features from the iPhone? If not, are there any plans to change this in the future? I've recorded a video:

https://youtu.be/1dbLO3CpYIA

emilianavt commented 2 years ago

Currently the following options are possible:

- Disabling mouth animation through ARKit tracking is not supported, since that is one of the main features people are usually looking for with ARKit tracking. You could try modifying your model's blendshapes and removing the ARKit ones that affect the mouth (see the sketch after this list). As long as other ARKit blendshapes are present, this should disable mouth tracking from the iPhone and still allow lip sync to operate.

- The "Only open mouth according to one source" option works like hybrid lip sync, but there is no smooth transition between the lip sync and mouth tracking blendshapes, which can cause the mouth to "snap" into place.

katrinshtorm commented 2 years ago

I've been looking for the ability to use mouth movements from iPhone tracking, but with a switch to the lip sync visemes alone, without blending them with the iPhone shapes, whenever words/sounds are spoken. With iPhone blendshapes it's harder to make a specific mouth shape while speaking than with the AEIOU visemes. When I speak using the AEIOU visemes, the mouth switches between them; when I speak using the iOS blendshapes, the mouth blends between the blendshapes and there is no quick switching between shapes. If I create distinctive mouth shapes (e.g. a triangular/cat mouth or small lips), the AEIOU visemes switch between them nicely, while the iOS pronunciation blends them in an ugly way.

emilianavt commented 2 years ago

Hybrid lip sync with "only open mouth according to one source" enabled will only use the audio lip sync while spoken audio is detected. Having looked at the video again, however, there is something very strange going on: at 00:08, the face should not be able to move at all. There might be a bug, I will have to look into it.

katrinshtorm commented 2 years ago

No, everything seems to be in order there; that is the mouth shape I use for the E viseme. The problem is that when I use iOS tracking, the mouth moves well, but the pronunciation lacks distinct mouth shapes to express it. For example, in anime the characters' mouths move quickly between frames and snap between different shapes, and the same is true for 2D VTubers. This can be reproduced with the AEIOU visemes. With iOS tracking, however, the pronunciation lacks expressive mouth shapes. If the iOS shapes are made to match the AEIOU visemes, the mouth will not switch cleanly between them like it does with the visemes, but will blend between the shapes in an ugly way (especially if they differ greatly).

So I think the ideal is to use iOS mouth tracking, but when words are being spoken, to use only the AEIOU visemes without combining them with the iOS blendshapes (because the combination looks bad). I also made a video where I read the same text, first with the AEIOU visemes and then with iOS tracking. There you can see the problem: the mouth lacks expressiveness with the iOS pronunciation, in contrast to the AEIOU visemes.

https://www.youtube.com/watch?v=m4vktFves8g

emilianavt commented 2 years ago

> So I think the ideal is to use iOS mouth tracking, but when words are being spoken, to use only the AEIOU visemes without combining them with the iOS blendshapes (because the combination looks bad).

This is exactly what enabling hybrid lip sync and "only open mouth according to one source" should do.
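For illustration, here is a minimal sketch of what that gating is supposed to do. This is not VSeeFace's actual implementation; the class, fields, and threshold below are all hypothetical.

```csharp
using System;
using UnityEngine;

// Illustrative only: a hard switch between audio visemes and ARKit mouth
// shapes, as "only open mouth according to one source" is described above.
public class MouthSourceGate : MonoBehaviour
{
    public float voiceVolume;            // assumed input from the audio lip sync
    public float voiceThreshold = 0.05f; // hypothetical detection threshold

    public void Apply(float[] visemeWeights, float[] arkitMouthWeights,
                      Action<float[]> applyVisemes, Action<float[]> applyArkitMouth)
    {
        bool speaking = voiceVolume > voiceThreshold;

        if (speaking)
        {
            // Speech detected: the AEIOU visemes drive the mouth exclusively;
            // the ARKit mouth shapes are zeroed rather than blended in.
            applyVisemes(visemeWeights);
            applyArkitMouth(new float[arkitMouthWeights.Length]);
        }
        else
        {
            // Silence: iPhone tracking takes the mouth back.
            applyVisemes(new float[visemeWeights.Length]);
            applyArkitMouth(arkitMouthWeights);
        }
        // The boolean switch has no cross-fade, which is why the mouth can
        // visibly "snap" when `speaking` flips from one frame to the next.
    }
}
```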

katrinshtorm commented 2 years ago

Then we have a problem here, because it doesn't work. Here is a video (~1 min) where I show the mouth movements with iOS tracking/visemes and the "only open mouth according to one source" option enabled. It doesn't work the way it's supposed to: instead of switching to the visemes, the visemes are combined with the iPhone tracking when I speak. https://youtu.be/1y3z6Rx3Aig

emilianavt commented 2 years ago

Yes, that's why I said that there seems to be a bug that I need to look into.

katrinshtorm commented 2 years ago

ok, thank you