KelianB / SPARK

Official implementation for the SIGGRAPH Asia 2024 paper SPARK: Self-supervised Personalized Real-time Monocular Face Capture

Questions about some details. #3

Closed by chenerg 3 weeks ago

chenerg commented 3 weeks ago

In Figure 2, it is observed that the neutral template is optimized during training. In your paper, it is also stated as follows:

D is initialized in a separate supervised pre-training stage that minimizes the squared norm $\|D(\gamma(x)) - E\|^2$ at the positions of the canonical FLAME vertices, such that the network initially mimics the FLAME expression basis E.

Does this mean that the blendshapes for the current identity are also optimized?
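
For readers following along, the pre-training objective quoted above can be sketched roughly as follows. This is only an illustration, not the SPARK code: the linear map `W` stands in for the network D, `positional_encoding` is a generic sin/cos encoding assumed for γ, and `canonical_vertices` and `E` are random placeholders for the FLAME data.

```python
import numpy as np

def positional_encoding(x, num_freqs=4):
    """Assumed form of gamma(x): raw positions plus sin/cos features."""
    feats = [x]
    for k in range(num_freqs):
        feats.append(np.sin((2.0 ** k) * x))
        feats.append(np.cos((2.0 ** k) * x))
    return np.concatenate(feats, axis=1)

rng = np.random.default_rng(0)
V, K = 200, 5  # hypothetical sizes: 200 vertices, 5 expression components

# Random placeholders for the canonical FLAME vertices and expression basis E.
canonical_vertices = rng.uniform(-1.0, 1.0, size=(V, 3))
E = rng.normal(scale=0.01, size=(V, K * 3))  # one row of basis values per vertex

features = positional_encoding(canonical_vertices)  # gamma(x), shape (V, 27)
W = np.zeros((features.shape[1], K * 3))            # linear stand-in for D

def loss(W):
    # Mean over vertices of the squared norm ||D(gamma(x)) - E||^2.
    return np.mean(np.sum((features @ W - E) ** 2, axis=1))

initial_loss = loss(W)
learning_rate = 0.01
for _ in range(500):
    residual = features @ W - E
    grad = 2.0 * features.T @ residual / V  # gradient of the loss w.r.t. W
    W -= learning_rate * grad
final_loss = loss(W)
```

After this stage the stand-in reproduces the target basis as well as a linear model can, which is the sense in which the network "initially mimics" E before the main training begins.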

KelianB commented 3 weeks ago

Hi, the quoted passage only refers to the expression blendshapes. For the identity blendshapes, we use the mean prediction from the initial pre-processing of the sequences (during which we also get our pose and expression coefficients for each frame). We then optimize per-vertex offsets on top of the neutral FLAME template with the identity shapes applied.
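
A rough sketch of how the pieces described in this reply could fit together is given below. It is an assumption-laden illustration rather than the repository's code: all array names and sizes are hypothetical, the bases are random placeholders, and pose (skinning) is left out entirely.

```python
import numpy as np

def personalized_neutral(template, shape_basis, mean_identity, vertex_offsets):
    """Neutral template + fixed identity shapes + optimizable per-vertex offsets.

    template: (V, 3), shape_basis: (V, 3, S), mean_identity: (S,),
    vertex_offsets: (V, 3).
    """
    return template + shape_basis @ mean_identity + vertex_offsets

def apply_expression(neutral, expr_basis, expr_coeffs):
    """Add expression blendshapes (expr_basis: (V, 3, K), expr_coeffs: (K,))."""
    return neutral + expr_basis @ expr_coeffs

rng = np.random.default_rng(0)
V, S, K, F = 100, 10, 5, 8  # hypothetical: vertices, identity dims, expression dims, frames

template = rng.normal(size=(V, 3))        # placeholder for the FLAME neutral template
shape_basis = rng.normal(size=(V, 3, S))  # placeholder for the FLAME identity basis
expr_basis = rng.normal(size=(V, 3, K))   # in the paper's setting, this would come from D

# Identity is not optimized: it is the mean of the per-frame pre-processing predictions.
per_frame_identity = rng.normal(size=(F, S))
mean_identity = per_frame_identity.mean(axis=0)

# The offsets are the quantity that would be optimized; they start at zero here.
vertex_offsets = np.zeros((V, 3))

neutral = personalized_neutral(template, shape_basis, mean_identity, vertex_offsets)
frame_vertices = apply_expression(neutral, expr_basis, rng.normal(size=K))
```

In this reading, only `vertex_offsets` and the expression blendshapes change during training, while `mean_identity` stays fixed at its pre-processed value.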

chenerg commented 3 weeks ago

That's clear, thanks!