yfeng95 / PRNet

Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network (ECCV 2018)
http://openaccess.thecvf.com/content_ECCV_2018/papers/Yao_Feng_Joint_3D_Face_ECCV_2018_paper.pdf
MIT License

A problem about the keypoints output #55

Open ghost opened 6 years ago

ghost commented 6 years ago

Hello! Thanks for the code! I am a little confused about the meaning of the output format. Specifically, I am not clear about which parts of the face these keypoints represent. Thank you!

ghost commented 6 years ago

Hello! I came across another question. I have found that when saving the 3D face, the shape of save_vertices is (43867, 3). What does this mean? Does it mean the x, y, z coordinates of 43867 points? If so, I am wondering which points represent the corners of the eyes. Thank you!

wungemach commented 6 years ago

This means that the output facial mesh has 43867 vertices in it. More specifically, when building the obj file for the predicted 3D facial mesh, a subset of 43867 pixels in the output 256 x 256 position map is chosen to represent vertices. This pixel --> vertex correspondence is given by the face_ind.txt file.
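A minimal sketch of that lookup, assuming the file layout from the repo's Data/uv-data folder and a predicted position map `pos` of shape (256, 256, 3):

```python
import numpy as np

# Pixel -> vertex lookup table shipped with the repo (path assumed from the
# PRNet source tree); it holds 43867 indices into the flattened 256x256 map.
face_ind = np.loadtxt('Data/uv-data/face_ind.txt').astype(np.int32)

def get_vertices(pos):
    """Select the 43867 mesh vertices from a (256, 256, 3) position map."""
    all_points = np.reshape(pos, [256 * 256, -1])  # (65536, 3)
    return all_points[face_ind]                    # (43867, 3): x, y, z per vertex
```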

As for the corners of the eyes, the uv_kpt_ind.txt file gives the x, y pixel coordinates of 68 "keypoint" vertices in the predicted mesh; 12 of these give locations around the eyes. It should be pretty easy to just plot the vertices from uv_kpt_ind.txt and see exactly which ones these are.
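A similarly hedged sketch for the keypoints, assuming uv_kpt_ind.txt stores x coordinates in its first row and y coordinates in its second, as in the repo's Data/uv-data folder:

```python
import numpy as np

# Keypoint lookup table (path assumed): row 0 holds x pixel coordinates,
# row 1 holds y pixel coordinates, one column per keypoint, shape (2, 68).
uv_kpt_ind = np.loadtxt('Data/uv-data/uv_kpt_ind.txt').astype(np.int32)

def get_landmarks(pos):
    """Read the 68 keypoints out of a (256, 256, 3) position map."""
    return pos[uv_kpt_ind[1, :], uv_kpt_ind[0, :], :]  # (68, 3)

# In the standard 68-point annotation, indices 36-47 are the twelve eye
# landmarks; the eye corners are 36, 39, 42 and 45.
```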

Hope this helps!

franciszzj commented 6 years ago

@wungemach The 3DMM output mesh has 53215 vertices. Do you know how to use PRNet to generate a 53215-vertex mesh? Where can I find or generate the corresponding index file? The index given by the author can generate only 43867 vertices.

wungemach commented 6 years ago

@franciszzj As far as I can tell, there is not a good way to do this, for a couple of reasons.

(1) PRNet doesn't predict the neck region, which accounts for much of the difference between 53215 and 43867 vertices, so you can't hope to use PRNet to predict vertices in that region.

(2) You can't just use the BFM_UV.mat uv-map for the Basel Face Model to pass from a [256, 256, 3] position map (the output of PRNet) to a mesh, for the following reason. The Basel Face Model uv-map is given with floating-point coordinates in [0, 1]. When you naively scale this uv-map up by 256, vertices that are too close together get rounded to the same location in the [256, 256] array. Thus multiple vertices in the mesh map down to the same pixel, so there is no hope of inverting this map. I do not know how the authors came up with their map to construct the 43867-vertex mesh; perhaps they had a smart way to resolve these collisions.
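You can see the collisions directly with a few lines of numpy. This sketch assumes BFM_UV.mat holds a (53215, 2) array of floats in [0, 1] under the key 'UV'; check the key names in your copy of the file.

```python
import numpy as np
from scipy.io import loadmat

# Assumed layout: a (53215, 2) array of uv coordinates under the key 'UV'.
uv = loadmat('BFM_UV.mat')['UV']

# Naively scale the uv coordinates onto a 256x256 pixel grid.
pixels = np.floor(uv * 255).astype(np.int32)

# Count how many vertices land on an already-occupied pixel.
n_unique = len(np.unique(pixels, axis=0))
print(f'{len(uv) - n_unique} of {len(uv)} vertices collide')
```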

I suspect that you can't get any significant increase in the size of PRNet's output mesh, since the 43867-vertex mesh they predict is already in 1-to-1 vertex-pixel correspondence with the non-zero region of the weight mask, i.e. they are already using all of the meaningful information put out by the network.
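If you want to check that correspondence yourself, a quick sketch (paths assumed from the repo's Data/uv-data folder; if the correspondence is 1-to-1, both counts should come out to 43867):

```python
import numpy as np
from skimage.io import imread

# Paths assumed from the repo's Data/uv-data folder.
face_ind = np.loadtxt('Data/uv-data/face_ind.txt').astype(np.int32)
mask = imread('Data/uv-data/uv_face_mask.png')

# Compare the number of vertex indices to the non-zero area of the mask.
print(len(face_ind), np.count_nonzero(mask))
```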

One thing that you could try is to train your own PRNet which also predicts the neck region from the data in 300W_LP. I have tried to do this with limited success. At first glance it seems like this shouldn't make the task any harder (if anything it seems easier, since the network would be allowed to output reasonable predictions in the neck area), but I think the point is that learning neck information from the images in 300W_LP is actually a pretty hard task, and the network struggles with it. This is just speculation, though; maybe someone else has a smarter way to train a network that also predicts that region.

Hope this helps!

franciszzj commented 6 years ago

@wungemach Thanks for your reply. I will reconsider this problem.

LogicHolmes commented 6 years ago

@wungemach Regarding the weight mask: I don't know how to distinguish between the neck and face areas. Do you have any suggestions?

wungemach commented 6 years ago

@LogicHolmes I had some trouble with this at first too. You can run the test file they provide to predict the target uv-map without the neck area (it did this automatically for me with the trump jpg). I was then able to just look at which pixels have nonzero values to extract the indices for the chin line, and combine them with the provided weight_mask.png. A rough sketch of the idea is below.
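This sketch assumes you have saved the predicted (256, 256, 3) position map to a .npy file; the filename here is hypothetical.

```python
import numpy as np

# Hypothetical path: however your run of the test script saves the
# predicted (256, 256, 3) position map.
pos = np.load('trump_pos.npy')

# Pixels where all three channels are zero were not predicted (no neck);
# the remaining non-zero pixels give the face-region indices, whose lower
# boundary is the chin line.
face_region = np.any(pos != 0, axis=-1)   # (256, 256) boolean mask
face_pixels = np.argwhere(face_region)    # (N, 2) row, col indices

# These indices can then be combined with the provided weight_mask.png to
# separate the face area from the neck area.
```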

Hope this helps!