georgeretsi / smirk

Official Pytorch Implementation of SMIRK: 3D Facial Expressions through Analysis-by-Neural-Synthesis (CVPR 2024)
https://georgeretsi.github.io/smirk/
MIT License
185 stars 21 forks

3D Face Tracking General Question #13

Closed emlcpfx closed 5 months ago

emlcpfx commented 5 months ago

Hi, I'm looking forward to giving this a test drive. This is a general question about 3DMMs and face tracking. Having read this board, it sounds like SMIRK won't yet produce smooth results on video input. Outside of EMOCAv2 and MICA, are you aware of any repos that have pushed that work further?

I've seen FlawlessAI's new paper improving results on 2D and 3D landmarks, but the code isn't publicly available.

filby89 commented 5 months ago

Hey, thanks for your interest in SMIRK :) In my opinion the results are smooth enough to be applied to video as well; you can see that in our accompanying video: https://www.youtube.com/watch?v=8ZVgr41wxbk. However, we have not thoroughly tested the video results. Another repo that includes temporal modeling with temporal convolutions is our previous work SPECTRE: https://github.com/filby89/spectre.
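For anyone who still sees residual frame-to-frame jitter, a simple option is post-hoc temporal smoothing of the per-frame coefficients before rendering. The sketch below is not part of the SMIRK codebase; it is a generic illustration using SciPy's Savitzky-Golay filter on a hypothetical `(T, D)` array of expression coefficients:

```python
import numpy as np
from scipy.signal import savgol_filter


def smooth_params(params, window=9, polyorder=2):
    """Temporally smooth per-frame 3DMM coefficients.

    params: (T, D) array, one row of expression/pose coefficients per frame.
    Returns an array of the same shape with high-frequency jitter reduced
    along the time axis.
    """
    if params.shape[0] < window:  # too few frames to filter meaningfully
        return params
    return savgol_filter(params, window_length=window, polyorder=polyorder, axis=0)


# Toy data: 100 frames of 50 coefficients following a smooth signal plus jitter.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 100)[:, None]
clean = np.sin(t) * np.ones((1, 50))
noisy = clean + 0.05 * rng.standard_normal((100, 50))
smoothed = smooth_params(noisy)
```

A window of 9 frames at 25-30 fps removes jitter without visibly lagging fast expressions, but the right window length depends on the footage.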

emlcpfx commented 5 months ago

Thanks for the reply. Check out the tool we made from EMOCA v2 — https://m.youtube.com/watch?v=uSxgt5mcbXM

I’m always on the lookout for ways to improve this workflow. Most of the development on fitting 3DMMs seems to have moved behind closed doors to the private sector, which bums me out.

filby89 commented 5 months ago

Thanks for your demo! Seems like pretty nice work :)

emlcpfx commented 3 months ago

I'm back to say that I did some side-by-side tests of SMIRK vs EMOCA, and SMIRK is much more stable. The eye alignment blows EMOCA out of the water. Really fantastic work!

filby89 commented 3 months ago

Thank you very much for your kind words :D It would be nice if you could also upload one or two videos of the comparisons you made (if possible).

emlcpfx commented 3 months ago

Hey @filby89 check out this comparison video: https://www.youtube.com/watch?v=XOpSaifwRpA

The eye alignment is better on SMIRK, and it looks more stable to me. There's one video there where EMOCA 'wins,' but from my other testing, SMIRK is able to track in more situations. EMOCA has a problem with eye alignment in general, due to the training data (if I remember right), and the eyes are always too low.

jimydavis commented 3 months ago

@emlcpfx that video looks really great. Did you by any chance try the same qualitative comparison against MICA / metrical-tracker? My understanding of Face2Face and metrical-tracker is that they should produce the highest quality, since they train / run gradient descent on one specific video (so the result cannot be generalized to any other video). The optimization also includes temporal consistency, if I am not wrong.
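To make the per-video optimization idea concrete: a toy sketch of fitting per-frame coefficients to targets while penalizing frame-to-frame change. Nothing here is taken from the MICA or metrical-tracker code; the loss weight and shapes are illustrative only:

```python
import torch

def temporal_consistency_loss(expr):
    """First-order smoothness term: penalize frame-to-frame changes.

    expr: (T, D) tensor of per-frame coefficients being optimized.
    This is a common way to add temporal consistency to per-video fitting.
    """
    return ((expr[1:] - expr[:-1]) ** 2).mean()


# Toy per-video fitting: pull noisy coefficients toward per-frame targets
# while keeping the trajectory temporally smooth.
torch.manual_seed(0)
target = torch.sin(torch.linspace(0, 6.28, 30))[:, None].repeat(1, 10)
expr = torch.randn(30, 10, requires_grad=True)
opt = torch.optim.Adam([expr], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    loss = ((expr - target) ** 2).mean() + 0.1 * temporal_consistency_loss(expr)
    loss.backward()
    opt.step()
```

Because the optimization sees the whole sequence at once, it can trade a little per-frame accuracy for smoothness, which is exactly why these per-video fitters tend to look more stable than single-frame regressors.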

emlcpfx commented 3 months ago

Yes, but MICA takes too long to process.

There were a couple other workflow reasons why EMOCA was the better choice, but I can’t remember.

The quality wasn’t worth the 10x compute time for what we’re doing.

filby89 commented 3 months ago

> Hey @filby89 check out this comparison video: https://www.youtube.com/watch?v=XOpSaifwRpA
>
> The eye alignment is better on SMIRK, and it looks more stable to me.

Thanks a lot for the video, it's a very nice comparison!