face geometry strange result using another landmark model

milad-4274 commented 3 years ago

Hi and thank you for your great framework. I'm using another landmark detection algorithm which give me 68 2d facial landmarks. and I use this landmarks to generate 3448 3d landmarks using this library. the result is good. amber_eos

after, I generated canonical_mesh of metadata file from eos generated .obj file. (I'm almost sure about the procesdure correctness). I also used random weight in metadata file for testing purpose. after, I rendered a black image to face and the result is like: amber_effect

can you tell me where is the problem source? I'm suspect to landmark random weights or depth offset in geometry_pipeline. Any tips would be appreciated.

sgowroji commented 3 years ago

Hi @milad-4274, Could you please elaborate MediaPipe changes specific to your above use case. Thanks

milad-4274 commented 3 years ago

Yes Sure, Step 1 first, in face mesh example I updated FaceLandmarkFrontGpu node to use another landmark model in FaceLandmarkGpu. I also changed in dependencies such as FaceLandmarkLandmarksToRoi subgraph. the result of 68 2d landmarks has been shown below: amber_landmarks

Step 2 after testing landmarks, I wrote custom eos_calculator to use the 68 landmark and eos library to generate 3448 3d landamrks. the result of this step has shown in first image of first message.

Step 3 eos library allow us to get .obj output. here is screenshot of the .obj file gained from eos library: eos_obj

Step 4 in step 4, I wrote a script to convert .obj file to canonical mesh which you can find it here.

Step 5 in this step, I replaced canoncal_mesh which gained in step 4 to geometry_pipeline_metadata_landmarks file. I also used random landmark weights for all point.

after doing all steps when I want to render a complete black image, result is like second image in first message.

brucechou1983 commented 3 years ago

@milad-4274 I'm just curious, can eos generate 3dmm in realtime on mobile devices?

milad-4274 commented 3 years ago

@milad-4274 I'm just curious, can eos generate 3dmm in realtime on mobile devices?

The short aswer is Yes. my mobile phone is almost low level phone and the FPS in my mobile device is not bad. I'll inform you about exact eos_calculator latency. but the accuracy around lip and eye is not good. for gaining better accuracy, you need to use higher num_iterations and so the latency become higher.

milad-4274 commented 3 years ago

@brucechou1983 the average eos_caluclator latency in my phone is ~35 ms for 1 iteration.

kostyaby commented 3 years ago

Hey @milad-4274,

1) Can you please confirm that your OBJ file only contains triangular faces? If not, you'd have to triangulate it yourself beforehand or change your converter code like this 2) Could you please assign all landmarks weights to 1? It'll make sure there are no weird artifacts which are produced by a accident due to the specific random values you assigned

milad-4274 commented 3 years ago

Thank you @kostyaby I uploaded my .obj file here. it seems my .obj file has triangular face. I converted .obj file using your mentioned script, the result remains same.

when I changed all landmark weights to 1, I got strange result with generated canonical mesh by your script. this is the result: photo_2021-07-10_13-15-55

but the result with my canonical mesh remains same.

Due to the effect result, now I suspect to index buffers. what do you think?

kostyaby commented 3 years ago

when I changed all landmark weights to 1, I got strange result with generated canonical mesh by your script

That is interesting. Right now, I have 2 possible explanations why it might not work:

1) the runtime vertex IDs (produced by the EOS library) don't match the canonical vertex IDs from the OBJ. That would explain why changing weights also changed the output 2) the procrustes solver somehow gets numerically unstable with the kind data you have with the new EOS approach. That would also why changing weights also changed the output

Maybe there is another explanation that I don't see right now, but I have an idea of a test that you might run to distinguish 1) from 2):

In order to stretch a black texture on a picture, you don't need a valid pose_transform_matrix from the face_geometry.proto, only a valid mesh; the general idea is to skip using the FaceGeometryFromLandmarks and to "pack" the face geometry proto without triggering the procrustes solver path
As for pose_transform_matrix, please fill it with a 4x4 identity matrix
As for mesh, fill index_buffer and the UV component of vertex_buffer with what you have in the OBJ (you can just hard-code the whole thing into a MP C++ calculator file)
The XYZ component of vertex_buffer is a bit trickier, but is also doable in the following way:
- Set Z coordinate to some value between the near / far Z planes (the defaults are 1 and 10000, I'd pick something like 50 or 100)
- Now, you want to understand what's the size of the XY plane at the picked Z when observed by the face geometry perspective camera (essentially, how much space does the camera see along X and Y axis (in centimeters) exactly Z centimeters away from the virtual camera);
- The width and height of that XY plane (let's call them W and H respectively) can be computed by the following formulas: link (perspective_matrix[0] is your W, perspective_matrix[5] is your H)
- Now, all that's left to do is to re-scale the face landmark X and Y from the [0, 1] x [0, 1] ranges to [-W/2, W/2] x [-H/2, H/2] - those values will be correctly projected back into the screen

If after doing this you'll still see visual artifacts, then 1) is likely a problem - otherwise, it's either 2) or some other FaceGeometryFromLandmarks-related issue

Does this instruction set make sense so far? Is this something you can give a try?

milad-4274 commented 3 years ago

Thank you @kostyaby for your guidence. tell me if i'm wrong. you mean, I should not use FaceGeometryPipelineCalculator and write my own custom face_geometry_calculator (according to previous guidence) without using procrustes solver to check if problem exist or not.
if it's right, I'll do it and inform here about result in next few days. thank you again.

kostyaby commented 3 years ago

Yes, correct, my advice for debugging this is to not use FaceGeometryPipelineCalculator but rather implement a simpler version of the calculator to check if the procrustes solver is the problem

milad-4274 commented 3 years ago

The width and height of that XY plane (let's call them W and H respectively) can be computed by the following formulas: link (perspective_matrix[0] is your W, perspective_matrix[5] is your H)

Hey @kostyaby I tried to implement your mentioned procedure for debugging but I have a problem. I believe H and W depend on the camera distance accross the z axis, but in your mentioned formula, I did not see any input of z. I assumed that H and W are dimensionless and I multiplied them to my selected z(50 or 100) (is it right assumption?). but I did not see anything in output video. no black texture and also no Messed up black texture. the simple function that I wrote for estimating geometry provided here%20%7B%7D-,absl%3A%3AStatusOr%3Cstd%3A%3Avector%3CFaceGeometry%3E%3E%20EstimateFaceGeometry(,%7D,-private%3A).

I have another strange observation too. I wrote some logs to see the value of W and H, I observed that W is greater than H, which is against my expectation.

aspect ratio is 0.5625 w = 145.053, h= 81.5926 and f: 1.63185

kostyaby commented 3 years ago

Hey @milad-4274,

I believe H and W depend on the camera distance accross the z axis, but in your mentioned formula, I did not see any input of z.

Whoops, you're right, those formulas are not what you need. What I gave you was perspective camera formulas which convert the metric 3D space into the NDC. What you need are formulas within the metric 3D space so you can later project them with a perspective camera. In a way, they have to be inverse in respect to the camera formulas. Sorry for a misleading advice! H and W at depth Z should be smth like this:

float h = 2.f * Z * std::tan(kDegreesToRadians * env_camera.vertical_fov_degrees() / 2.f);
float w = h * aspect_ratio;

Please note, that here Z is positive (let's say, you want to project points Z = 100 = 100cm away from camera), but Face Geometry module operates in the right-handed coordinate system so when you assign .z coordinate to your mesh vertex it has to be going in the negative direction of the Z axis (from 0 to -Inf, not to +Inf) so you'd probably have to multiply it by -1 before writing.

I have another strange observation too. I wrote some logs to see the value of W and H, I observed that W is greater than H, which is against my expectation.

That's a good point, you shouldn't be observing this with the new formula

milad-4274 commented 3 years ago

@kostyaby Thank you for your great help. just for informing, I got almost correct answer with your guidence. the black texture renderred result represented in following image:

photo_2021-07-17_10-23-19

as you said before:

the runtime vertex IDs (produced by the EOS library) don't match the canonical vertex IDs from the OBJ. That would explain why changing weights also changed the output

the procrustes solver somehow gets numerically unstable with the kind data you have with the new EOS approach. That would also why changing weights also changed the output

If after doing this you'll still see visual artifacts, then 1) is likely a problem - otherwise, it's either 2) or some other FaceGeometryFromLandmarks-related issue

so the most likely problem is about numerical problem with procrustes solver. do you have any suggestion for starting point?

kostyaby commented 3 years ago

Thanks for validating this idea!

In addition to the "numerical instability of the pipeline" hypothesis, I'd like to add a hypothesis that there's something wrong with Z coordinate of the eos_calculator output (i.e. it doesn't follow the same conventions that are followed by the MP face landmark pipeline). I think it's a valid candidate too cuz you have literally overridden the Z coord in your latest test and the issue was fixes, which makes it possible that the Z coord is potentially the problem. Here's what I'd try next:

First, log the screen_landmarks to see what is it like (here is a good place). You can also compute & log some metrics: width / height / depth of the screen_landmarks (i.e. the difference between min and max coordinate along X / Y / Z), the mean coordinate value for X / Y / Z. For the MP face landmark pipeline, I'd expect width to be roughly similar to depth and also I'd expect mean_z to be around 0.
Then, log the first_iteration_scale to see what is it like (here is a good place). It should look sane (like, no NaNs or +/-Infs, no super-large numbers greater than ~10^5 or super-small numbers less than ~10^-5)
Then, log the intermediate_landmarks after the first fitting iteration to see what is it like (here is a good place). Again, it should look sane. It should also roughly be aligned with your canonical face model (you can even try computing & logging a MSE error between the two sets)
Then, log the pose_transform_mat to see what is it like (here is a good place). Remember, that this matrix is supposed to be the uniform scale + rotation + translation components. You can easily check whether the translation part (the last column of the matrix), you can also validate that the top-left 3x3 corner submatrix is a sane uniform scale + rotation matrix
Then, I'd log the metric_landmarks to see that is it like (here is a good place). It should roughly be aligned with your canonical face model (better than the intermediate_landmarks, also consider computing & logging a MSE error between the two sets)

google-ml-butler[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

google-ml-butler[bot] commented 3 years ago

Closing as stale. Please reopen if you'd like to work on this further.

google-ai-edge / mediapipe

face geometry strange result using another landmark model #2253