Rubikplayer / flame-fitting

Example code for the FLAME 3D head model. The code demonstrates how to sample 3D heads from the model, fit the model to 3D keypoints and 3D scans.
http://flame.is.tue.mpg.de/

Bad model performance (fitting) #50

Open Stefano-retinize opened 7 months ago

Stefano-retinize commented 7 months ago

First of all, thank you for this amazing work.

I am getting poor fitting results. I want to do retargeting, so I am aiming to obtain the FLAME shape parameters, and I was hoping to use this repository to get a very close head shape. Since I want to build a new head that looks very similar to a head in neutral position, I took all expression and pose parameters out of the chumpy optimization, so that only the shape parameters are optimized and the others are forced to stay at zero:

# Optimize only the translation and the 300 identity shape parameters.
# Originally: free_variables = [model.trans, model.pose, model.betas[used_idx]]
free_variables = [model.trans, model.betas[:300]]
ch.minimize( fun      = objectives,
                 x0       = free_variables,
                 method   = 'dogleg',
                 callback = on_step,
                 options  = opt_options )

But the results I am getting are very bad.

Objective: https://github.com/Rubikplayer/flame-fitting/assets/136068442/c2410bf1-6fdf-47e7-9d22-df5a08973eb5

Fitting Result: https://github.com/Rubikplayer/flame-fitting/assets/136068442/fd74f868-996b-4976-ade3-afd653fbd1ee

Would you say they are the same person?

Am I doing something wrong? Is this the expected quality of this approach? Thanks.

TimoBolkart commented 7 months ago

In this fitting scenario, the model is trying to explain the entire head of the target mesh, including the hair cap, etc. This can hinder the overall fitting performance, as the number of identity shape parameters (i.e., 300 in this example) might be too limited to explain the entire head well.

For a better fitting result, in case you are using some regularizer on the identity shape parameters, you can try reducing the weight of this regularizer. Further, to increase the fitting quality in more important regions such as the face, consider adding some per-vertex weight to the vertex-to-vertex loss that increases the weight in the face region, while lowering the weight in other regions such as the back of the head, the neck, etc.
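The per-vertex weighting suggested above could look roughly like the following NumPy sketch. It is only an illustration, not code from this repository: the function name `weighted_vertex_residuals`, the `face_idx` index array, and the weight values 1.0 and 0.1 are all hypothetical, and a real face mask would have to come from your own vertex selection. The same arithmetic would presumably carry over to the chumpy expressions used in the fitting script, but that is untested here.

```python
import numpy as np

def weighted_vertex_residuals(model_vertices, target_vertices, vertex_weights):
    """Return per-vertex weighted vertex-to-vertex residuals of shape (V, 3).

    The square root is applied so that, after squaring and summing the
    residuals, each vertex contributes vertex_weights[i] * ||difference||^2.
    """
    return np.sqrt(vertex_weights)[:, None] * (model_vertices - target_vertices)

# Toy example with 6 vertices; indices 0..2 stand in for the face region.
num_vertices = 6
face_idx = np.array([0, 1, 2])           # hypothetical face-region indices
weights = np.full(num_vertices, 0.1)     # low weight for back of head, neck, ...
weights[face_idx] = 1.0                  # high weight for the face

model_v = np.zeros((num_vertices, 3))
target_v = np.ones((num_vertices, 3))

residuals = weighted_vertex_residuals(model_v, target_v, weights)
loss = float(np.sum(residuals ** 2))     # weighted sum of squared distances
```

In this toy case each face vertex contributes 3.0 to the loss and each remaining vertex 0.3, so a mismatch in the face region is penalized ten times more strongly.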

Stefano-retinize commented 7 months ago

Thanks @TimoBolkart for your answer.

Since the optimization problem is nonlinear, the algorithm behind the chumpy library is a heuristic and is therefore not guaranteed to reach a global minimum. That said, is this the only way to find the FLAME parameters of a specific shape? Given a shape, what other options do we have for finding its FLAME parameters?

Thanks.

TimoBolkart commented 7 months ago

While this is true, the problem in this case is likely not an optimization problem but a problem of optimizing with too many constraints. In that particular example it seems you are fitting a VOCA or CoMA template mesh, is that correct? These templates are in canonical space, which means one does not need to optimize for pose and expression, which makes the actual optimization problem linear. In that specific case, instead of optimizing for the FLAME identity parameters, one could solve it by inverting the linear identity transformation as the identity shape basis is a scaled orthogonal matrix.

Stefano-retinize commented 7 months ago

It is indeed a VOCA or CoMA template mesh, and I also verified that it seems to be a FLAME mesh, since the vertices are in the exact same order. (I didn't want to constrain the problem to just these types of meshes, but it is still useful.) Could you provide more information about how to solve this? What do you mean by "inverting the linear identity transformation"?

Thank you.

TimoBolkart commented 7 months ago

The FLAME shapedirs is a tensor of shape 5023 x 3 x 400, where the first 300 basis vectors are for identity and the remaining 100 are for expression. In this specific example of fitting a registered, unposed FLAME mesh (i.e., one in FLAME's canonical space), you can take the identity shape basis (the 5023 x 3 x 300 subset of the shapedirs), reshape it to a (5023*3) x 300 = 15069 x 300 matrix S, compute the norm of each of the 300 columns, and factor these scales out of the basis (S = S_normalized x D_scale, where D_scale is the diagonal matrix of per-column scales). This can be inverted analytically, as S_normalized is an orthogonal matrix. So if you have an unposed set of vertices X (a vector of length 5023*3 = 15069), you can solve for the FLAME parameters as id_params = D_scale^{-1} x S_normalized^{T} x X.
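The closed-form solve described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not code from the repository: the function name `solve_identity_params` is hypothetical, the basis is a small synthetic stand-in rather than the real FLAME shapedirs, and the sketch assumes that X denotes vertex offsets from the model's template mesh (FLAME adds its shape offsets to a template, so the template is subtracted first).

```python
import numpy as np

def solve_identity_params(identity_basis, template, vertices):
    """Recover identity coefficients for a registered mesh in canonical space.

    identity_basis: (V, 3, K) identity part of the shape basis
    template:       (V, 3) template vertices (assumed to be subtracted first)
    vertices:       (V, 3) target vertices, same vertex order as the model
    """
    num_params = identity_basis.shape[-1]
    S = identity_basis.reshape(-1, num_params)    # (3V, K)
    scales = np.linalg.norm(S, axis=0)            # diagonal entries of D_scale
    S_normalized = S / scales                     # unit-length columns
    offsets = (vertices - template).reshape(-1)   # X as a vector of length 3V
    # id_params = D_scale^{-1} x S_normalized^T x X
    return (S_normalized.T @ offsets) / scales

# Synthetic stand-in for the model: orthonormal columns times per-column scales.
rng = np.random.default_rng(0)
V, K = 200, 10                                    # real FLAME: V = 5023, K = 300
Q, _ = np.linalg.qr(rng.normal(size=(3 * V, K)))  # (3V, K), orthonormal columns
true_scales = rng.uniform(0.5, 2.0, size=K)
basis = (Q * true_scales).reshape(V, 3, K)
template = rng.normal(size=(V, 3))

true_params = rng.normal(size=K)
target = template + basis @ true_params           # unposed mesh, shape (V, 3)

recovered = solve_identity_params(basis, template, target)
max_error = float(np.max(np.abs(recovered - true_params)))
```

The recovery is exact only because the synthetic columns are orthogonal by construction. If a basis were not exactly orthogonal, a least-squares solve such as `np.linalg.lstsq(S, offsets, rcond=None)` would be the more general fallback.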

Stefano-retinize commented 7 months ago

Could you confirm that the shapes of the matrices would be:

S: 15069 x 300
S_normalized: 15069 x 300
D_scale: 300 x 300

and D_scale is a matrix where all the values outside the diagonal are 0? Thanks