Closed Jim61C closed 6 years ago
The positioning has to be consistent: if it varies in the Z component, it will just look like noise during training. If I remember correctly, on this particular model the Z placement is done using the mean of the original mesh. I've also done this based on the point closest to the camera (usually the tip of the nose in frontal images, for example). I'm not sure which works best, and I am not completely sure (off the top of my head) which way I did it for this particular model.
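For illustration, the two Z-placement conventions described here could be sketched as follows (a minimal sketch, assuming an (N, 3) vertex array and a camera for which larger Z means closer; none of these helper names come from the repository):

```python
import numpy as np

def place_z_by_mean(vertices):
    """Centre the mesh in depth: shift so the mean Z is zero."""
    v = np.asarray(vertices, dtype=float).copy()
    v[:, 2] -= v[:, 2].mean()
    return v

def place_z_by_closest(vertices):
    """Anchor on the point closest to the camera (e.g. the nose tip
    in a frontal face): shift so the largest Z is zero."""
    v = np.asarray(vertices, dtype=float).copy()
    v[:, 2] -= v[:, 2].max()
    return v
```

Either way, every training mesh gets the same Z reference, which is the consistency being described.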
The output of the network is not thresholded, at least in the MATLAB version, where I just let MATLAB decide. If you do threshold it, the output will be more blocky (with most marching-cubes-like methods), since the isosurface can no longer be smoothed based on the voxel intensities.
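The blockiness comes from how marching-cubes-style methods place the surface between two neighbouring voxels: the crossing point is linearly interpolated from the raw intensities, and binarising first destroys that sub-voxel information. A tiny illustrative sketch (not code from the repository):

```python
def crossing_position(a, b, level=0.5):
    """Sub-voxel position of the isosurface between two neighbouring
    voxels with intensities a and b (linear interpolation, as used by
    marching-cubes-style methods)."""
    return (level - a) / (b - a)

# Raw intensities: the surface can land anywhere between the voxels.
raw = crossing_position(0.2, 1.2)  # lands 30% of the way across
# After thresholding at 0.5, neighbours are only ever 0 or 1, so the
# crossing always lands exactly halfway -> a blocky surface.
thr = crossing_position(0.0, 1.0)  # always 0.5
```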
If you are computing error for a publication, I would appreciate it if the comparison was done against the MATLAB implementation. The Python implementation uses quite a bad isosurface function.
Thank you very much for the clarification! I see! For the thresholding: yes, no thresholding is done during the isosurface calculation, to avoid exactly the blocky effect you mentioned. What I meant was the voxelisation process: I believe the current model cuts off the 'ear' part, so I am wondering how the 3DMM meshes are cut to generate the voxelisation result for training. Thank you!
Ah, I see. We manually selected the frontal region of a frontal face, in advance, and then used those saved indices for each face we voxelised.
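As a sketch of what applying such saved indices could look like (a hypothetical helper, not the actual voxelisation code): keep only the pre-selected frontal vertices and the triangles whose corners all survive, remapping the face indices accordingly:

```python
import numpy as np

def crop_to_frontal(vertices, faces, keep_idx):
    """Keep only the manually selected frontal-region vertices
    (keep_idx: saved vertex indices) and the faces whose three
    corners all survive, remapping face indices to the new order."""
    keep_idx = np.asarray(keep_idx)
    remap = -np.ones(len(vertices), dtype=int)
    remap[keep_idx] = np.arange(len(keep_idx))
    kept_faces = faces[np.all(np.isin(faces, keep_idx), axis=1)]
    return vertices[keep_idx], remap[kept_faces]
```

Because the selection is done once on a frontal face and the 3DMM topology is fixed, the same index list can be reused for every face before voxelising.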
Cool, thanks! Would you mind providing those indices, or some guidelines on how to obtain them? I can see the face contour indices in the 3DDFA code base, but I am not sure how to derive the frontal face indices from that contour. Thanks!
Another question: it seems that the ground truth voxel volume used for supervision is actually the mesh with its Z component scaled by 2, which results in a 'warped' face shape during learning. Is there a particular reason for doing this instead of learning the unscaled shape? Thank you!
Hey, the vertices can be found here: http://cs.nott.ac.uk/~psxasj/download.php?file=vrn-3ddfa-vertexfilter
The original purpose of scaling the Z component was to try to improve detail. I'm not actually sure how much it helped, as we never did any quantitative analysis of this. The code provided scales the Z component by 0.5, if I remember correctly.
Great thanks for the vertex indexes!
I see. Yes, during the isosurface calculation the raw volume is scaled back by multiplying the Z component by 0.5, which means the original was scaled by 2. I was wondering if there is a principle / rule of thumb behind this.
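In code terms, the round trip described here might look like this (hypothetical helpers mirroring the description, not the repository's code):

```python
import numpy as np

def stretch_z(vertices, factor=2.0):
    """Stretch the Z component before voxelising the ground truth,
    the idea being to give the network more depth resolution."""
    v = np.asarray(vertices, dtype=float).copy()
    v[:, 2] *= factor
    return v

def unstretch_z(vertices, factor=0.5):
    """Undo the stretch on vertices recovered from the predicted
    volume, restoring the original proportions."""
    return stretch_z(vertices, factor)
```

The two factors are reciprocal, so voxelising a stretched mesh and unstretching the reconstruction returns the face to its original aspect ratio.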
Thanks.
Hi!
Thank you for making the feed-forward code available! I just wish to see how the voxelisation result is placed in the cube. I understand that the x-y of the voxel volume aligns with the 192x192 image, since an orthographic camera model is used. What convention did you use to place the volume along the z coordinate? Are you placing it so that the volume is always centred in z? Also, may I ask how much of the model you cut off at the back? Is there a particular threshold used? Thank you very much!
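For context, here is a rough sketch of the kind of orthographic voxelisation I have in mind (entirely hypothetical: the 192x192 grid comes from the model, but the depth of 200, the rounding, and the mean-centred Z are my assumptions about the convention):

```python
import numpy as np

def voxelise_orthographic(vertices, depth=200):
    """Drop each mesh vertex into a 192x192xdepth occupancy grid
    with an orthographic camera: x-y align with the 192x192 image,
    and the mesh is centred along z (one possible convention)."""
    grid = np.zeros((192, 192, depth), dtype=np.uint8)
    v = np.asarray(vertices, dtype=float).copy()
    v[:, 2] -= v[:, 2].mean()        # centre the mesh in depth
    v[:, 2] += depth / 2.0           # move to the middle of the grid
    ijk = np.round(v).astype(int)
    inside = np.all((ijk >= 0) & (ijk < [192, 192, depth]), axis=1)
    x, y, z = ijk[inside].T
    grid[y, x, z] = 1                # image row = y, column = x
    return grid
```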