AaronJackson / vrn

:man: Code for "Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression"
http://aaronsplace.co.uk/papers/jackson2017recon/
MIT License

ICP on florence face dataset #144

Closed. ydwen closed this issue 4 years ago.

ydwen commented 4 years ago

Hi Aaron,

I have a question about the evaluation on the Florence face dataset. When we compute the NME of two point clouds (prediction and target), they are usually at different scales. Specifically, the predicted point cloud lies in the range of the image (e.g. [1, 256]), while the target point cloud does not. However, the ICP function we use is pcregrigid in MATLAB, which does not account for the scale difference between the moving and fixed point clouds. In this case, how do we compute the NME if the two point clouds are at different scales? Should we include scale in the ICP? Or do you have another solution?

Thanks for your patience.

AaronJackson commented 4 years ago

If the input image has been scaled by VRN then the output will need to be scaled back to the original by applying the inverse transform. There should not be any need to apply any manual scaling to bring things back into alignment. ICP should only be needed for finding the correspondence; once you have that, the NME can be calculated.
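For concreteness, a minimal sketch of that last step, assuming `pred_xyz` (N×3) and `gt_xyz` (M×3) are vertex arrays already in the same coordinate frame and `corr` holds, for each predicted vertex, the index of its corresponding ground-truth vertex as returned by the ICP step. Normalising by the ground-truth bounding-box diagonal is an assumption here, not necessarily the choice used in the paper:

```matlab
% Hypothetical inputs: pred_xyz (Nx3) predicted vertices, gt_xyz (Mx3)
% ground-truth vertices, corr (Nx1) correspondence indices from ICP.
err = sqrt(sum((pred_xyz - gt_xyz(corr, :)).^2, 2));   % per-vertex error

% Normalisation factor: ground-truth bounding-box diagonal (assumed).
d   = norm(max(gt_xyz) - min(gt_xyz));
nme = mean(err) / d;
```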

ydwen commented 4 years ago

Sorry for the confusion.

My actual question is how to associate the xyz coordinates provided by the obj file with those in the rendered face image. Aren't they at different scales? Is that addressed by ICP?

Thanks.

AaronJackson commented 4 years ago

The input image is scaled using a landmark detector, so the vertices may have a different scale to the original input image. The inverse of this can be applied to go back to the original image scale. To figure out what this scale is, you'll need to go through the code and see how it is calculated.

I don't really understand your question to be honest, sorry

ydwen commented 4 years ago

Really sorry for my bad description, but we are getting close.

The Florence face dataset provides two things: (a) the original images (or rendered images?) and (b) the xyz coordinates of the vertices.

From the original image, VRN predicts xyz coordinates. These are at different scales, right?

AaronJackson commented 4 years ago

Like I said, some scale is applied to normalise the face to an expected size. You'll need to apply the inverse of this scaling to go back to the original size. This scaling is calculated and performed in lines 43-62 of run.m. Apply the inverse of these operations, in the correct order, to the predicted mesh to go back to your original scale. You do not need to use ICP to find the scale.
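As a rough illustration only (not the actual code from run.m, and with made-up variable names), undoing the crop and resize might look like the sketch below; the real values and their ordering must be taken from lines 43-62 of run.m:

```matlab
% Illustrative only: 'scale' and 'tl' stand in for whatever resize factor
% and crop origin run.m derives from the detected landmarks.
pred_xyz(:, 1:2) = pred_xyz(:, 1:2) ./ scale;     % undo the resize
pred_xyz(:, 1:2) = pred_xyz(:, 1:2) + (tl - 1);   % undo the crop offset
pred_xyz(:, 3)   = pred_xyz(:, 3)   ./ scale;     % z scales with x/y (assumed)
```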

ydwen commented 4 years ago

Got it. Thanks!

HOMGH commented 3 years ago

Hi Yandong and Aaron, @ydwen @AaronJackson. I have a question about computing the NME for 3D face reconstruction. When the output 3D mesh of an algorithm has, say, 43k vertices (as with PRNet), how can we compare it with a ground-truth mesh that has, say, 53k vertices (as with the AFLW2000 dataset)? Would you please share the NME computation code you used, based on the discussion above? Thank you.

AaronJackson commented 3 years ago

Hi

I used pcregrigid in Matlab to find the correspondence between the two meshes without applying a transformation. This required a modification to the built-in Matlab function, which unfortunately I cannot share, as it's copyrighted by MathWorks. To make matters worse, the newer versions of Matlab only include a compiled version of pcregrigid, and as such, it is not possible to make the same modification.

The modification was essentially to also return the indices used to produce the transformed mesh. Your best bet is to find an ICP implementation which also returns the correspondence.
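One unofficial workaround, since the stock pcregrigid does not return the correspondence, is to use it (or pcregistericp in newer MATLAB releases) only for the alignment and then recover the correspondence with a nearest-neighbour search; this also handles meshes with different vertex counts, such as the 43k-versus-53k case above:

```matlab
% Workaround sketch (not the modified MathWorks code): rigid ICP for the
% alignment only, then knnsearch to recover the correspondence.
pred = pointCloud(pred_xyz);   % e.g. ~43k predicted vertices
gt   = pointCloud(gt_xyz);     % e.g. ~53k ground-truth vertices

[~, predReg] = pcregrigid(pred, gt);   % pcregistericp in newer releases

% For each predicted vertex, the index of the closest ground-truth vertex.
corr = knnsearch(gt_xyz, predReg.Location);
err  = sqrt(sum((predReg.Location - gt_xyz(corr, :)).^2, 2));
```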

Sorry about that! Aaron
