Open YokkaBear opened 4 years ago
Hi!
Evaluation metrics can be computed only when you have ground truth depth maps, such as in the provided synthetic face dataset. After running the test code with run_test: true and load_gt_depth: true specified in the *.yml config file, you should see a text file named eval_scores.txt in the result folder, which contains the scores.
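For reference, here is a rough sketch of how those two scores could be computed from a predicted and a ground truth depth map, assuming the usual definitions (SIDE as the standard deviation of log-depth differences, MAD as the mean angle between normal maps derived from the depths). The function names and the masking are my own; the eval_scores.txt produced by the repo is the authoritative output.

```python
import numpy as np

def side(d_pred, d_gt, mask):
    """Scale-invariant depth error (sketch): std of log-depth differences over the mask."""
    diff = np.log(d_pred[mask]) - np.log(d_gt[mask])
    return np.sqrt(np.mean(diff ** 2) - np.mean(diff) ** 2)

def normals_from_depth(d):
    """Rough normal map from a depth image via finite differences (ignores camera intrinsics)."""
    dzdy, dzdx = np.gradient(d)
    n = np.stack([-dzdx, -dzdy, np.ones_like(d)], axis=-1)
    return n / (np.linalg.norm(n, axis=-1, keepdims=True) + 1e-7)

def mad(d_pred, d_gt, mask):
    """Mean angle deviation (sketch): mean angle between the two normal maps, in degrees."""
    n1, n2 = normals_from_depth(d_pred), normals_from_depth(d_gt)
    cos = np.clip((n1 * n2).sum(-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos[mask]).mean())
```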
Thank you so much. Excuse me for another question about the ground truth data: how did you generate the depth map images used as ground truth (as shown in the attached picture)? For example, if I want to check the evaluation metrics of a model trained on the cat face dataset or others, it would be necessary to obtain depth map images as ground truth. Thank you.
We used a synthetic face model to obtain ground truth depth maps for evaluation. We do not have cat face datasets with ground truth depth maps, so we cannot evaluate the predicted depth maps directly. For human faces, there are datasets with ground truth 3D scans, such as the NoW dataset, so it is possible to perform a direct evaluation of the depth predictions.
Thank you for your reply. Following your guidance, I have browsed some 3D human face datasets such as the NoW dataset and Bosphorus, but I noticed that their ground truth data are mostly given as obj/off files representing the 3D scans, not as depth map images. So I wonder how you convert 3D scans in obj/off format into depth maps stored as png/jpg images. Thank you.
Cool! There are various ways of converting depth maps to meshes, which can be stored as an .obj file. We have a piece of code in the demo, which does this in a fairly naive way: https://github.com/elliottwu/unsup3d/blob/30f4550b6bab6a520e9dd005dadac637b2fb9eb6/demo/demo.py#L182.
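For anyone following along, here is a minimal sketch of that naive idea (not the repo's exact code): back-project each pixel to a 3D point with assumed pinhole intrinsics and connect neighbouring pixels into two triangles per grid cell. The field of view and output path are placeholders.

```python
import numpy as np

def depth_to_obj(depth, fov_deg=10.0, path='mesh.obj'):
    """Naive depth-map-to-mesh conversion: one vertex per pixel, two triangles per grid cell."""
    h, w = depth.shape
    f = (w - 1) / 2 / np.tan(np.radians(fov_deg) / 2)  # assumed pinhole focal length
    cx, cy = (w - 1) / 2, (h - 1) / 2
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    xs = (us - cx) / f * depth
    ys = (vs - cy) / f * depth
    verts = np.stack([xs, ys, depth], axis=-1).reshape(-1, 3)

    faces = []
    for v in range(h - 1):
        for u in range(w - 1):
            i = v * w + u  # OBJ indices are 1-based, hence the +1 offsets below
            faces.append((i + 1, i + 2, i + w + 1))
            faces.append((i + 2, i + w + 2, i + w + 1))

    with open(path, 'w') as fo:
        fo.writelines(f'v {x} {y} {z}\n' for x, y, z in verts)
        fo.writelines(f'f {a} {b} {c}\n' for a, b, c in faces)
```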
Hi @elliottwu, I am now trying to train your unsup3d model on another human face dataset, but I noticed that the images in my dataset are all un-cropped, i.e. the background is still included. So I wonder whether I should apply some face-cropping method to build a new face-cropped dataset for training, or just use the original un-cropped dataset? Hoping for your reply, thank you.
You should crop the images for better results. The MTCNN face detector (facenet) is a good option. The demo code provides a cropping scheme: https://github.com/elliottwu/unsup3d/blob/30f4550b6bab6a520e9dd005dadac637b2fb9eb6/demo/demo.py#L110.
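In case it is useful to others, here is a minimal cropping sketch using the MTCNN from facenet-pytorch (this is not the demo's exact scheme; the margin factor and output size are my own choices):

```python
from PIL import Image
from facenet_pytorch import MTCNN

def crop_face(img_path, out_path, margin=0.6):
    """Detect a face box with MTCNN and crop a square region around it."""
    img = Image.open(img_path).convert('RGB')
    boxes, _ = MTCNN(keep_all=False).detect(img)
    if boxes is None:
        return  # no face found
    x0, y0, x1, y1 = boxes[0]
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half = max(x1 - x0, y1 - y0) * (1 + margin) / 2
    box = (int(cx - half), int(cy - half), int(cx + half), int(cy + half))
    img.crop(box).resize((256, 256)).save(out_path)
```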
@elliottwu big thanks for your demo scripts, the cropping works much better than the methods I found on my own. 👍
Hi @elliottwu, when I tried to apply the unsup3d model to another dataset, I encountered the following situation:
I had trained unsup3d on that dataset for 30 epochs, but when I used the trained model to predict 3D human faces from the input face images, I got the following intermediate results:
00001_canonical_albedo.png
00001_canonical_image.png
00001_recon_image_flip.png
00001_recon_image.png
It seems that the reconstructed face retains only half of the original face and loses the other half, and the canonical face then mirrors that remaining half, leading to a weird result (just a guess, I can't say for sure).
In this case, what should I do to make the model output a better, correct result? Could it be related to the quality of the dataset, or to dataloader parameters such as 'image_size' or 'crop'?
I really hope to hear your thoughts and advice on this problem, thank you!
Hi @elliottwu, I was trying to recover a 3D mesh (.obj) from the BFM ground truth depth maps using your demo code, but I ran into some trouble implementing this. Would you (or anyone else) be willing to share code for recovering a 3D mesh (.obj) from those GT depth maps? Thanks a lot!
**** update ****
Here is my code for the depth-to-3D recovery. However, when I input a ground truth depth image (Fig. 1) with dimensions (256, 256, 3), I get a strange 3D mesh result (Fig. 2). Could anyone tell me how to get a correctly recovered 3D mesh? It is quite urgent, greatest thanks!!!
Fig. 1
Fig. 2
```python
import os
import numpy as np
import torch
import torch.nn as nn
import cv2
from utils import *  # expected to provide get_grid and export_to_obj_string

EPS = 1e-7
use_gpu = True  # default
device = 'cuda:1' if use_gpu else 'cpu'
image_size = 256  # input resized to 256 (the original demo used 64)
min_depth = 0.9
max_depth = 1.1
border_depth = 0.7*max_depth + 0.3*min_depth
fov = 10  # in degrees
save_dir = '/root/3dface/unsup3d_modified/demo/images/depth_test/results'

depth_rescaler = lambda d: (1+d)/2*max_depth + (1-d)/2*min_depth  # (-1,1) => (min_depth,max_depth)

# camera intrinsics
fx = (image_size-1)/2/(np.tan(fov/2 * np.pi/180))
fy = (image_size-1)/2/(np.tan(fov/2 * np.pi/180))
cx = (image_size-1)/2
cy = (image_size-1)/2
K = [[fx, 0., cx],
     [0., fy, cy],
     [0., 0., 1.]]
K = torch.FloatTensor(K).to(device)
inv_K = torch.inverse(K).unsqueeze(0)  # 1x3x3 inverse intrinsics
K = K.unsqueeze(0)


def depth_to_3d_grid(depth, inv_K_mat=None):
    # back-project every pixel to a 3D point using the inverse intrinsics
    if inv_K_mat is None:
        inv_K_mat = inv_K  # fall back to the global inverse intrinsics
    b, h, w = depth.shape
    grid_2d = get_grid(b, h, w, normalize=False).to(depth.device)  # Nxhxwx2
    depth = depth.unsqueeze(-1)
    grid_3d = torch.cat((grid_2d, torch.ones_like(depth)), dim=3)
    grid_3d = grid_3d.matmul(inv_K_mat.transpose(2, 1)) * depth
    return grid_3d


def get_normal_from_depth(depth):
    b, h, w = depth.shape
    grid_3d = depth_to_3d_grid(depth, inv_K)
    tu = grid_3d[:, 1:-1, 2:] - grid_3d[:, 1:-1, :-2]
    tv = grid_3d[:, 2:, 1:-1] - grid_3d[:, :-2, 1:-1]
    normal = tu.cross(tv, dim=3)

    zero = normal.new_tensor([0, 0, 1])
    normal = torch.cat([zero.repeat(b, h-2, 1, 1), normal, zero.repeat(b, h-2, 1, 1)], 2)
    normal = torch.cat([zero.repeat(b, 1, w, 1), normal, zero.repeat(b, 1, w, 1)], 1)
    normal = normal / (((normal**2).sum(3, keepdim=True))**0.5 + EPS)
    return normal


if __name__ == "__main__":
    input_path = '/root/3dface/unsup3d_modified/demo/images/depth_test/000008_depth_1_1.png'  # input depth image
    depth_input = cv2.imread(input_path)

    canon_depth = torch.Tensor(depth_input).permute(2, 0, 1)  # 3x256x256
    canon_depth = canon_depth.mean(dim=0, keepdim=True)  # 1x256x256, averaged over the 3 channels

    ## rescale to the canonical depth range
    b = 1  # batch size is 1 before the torch.cat below
    canon_depth = canon_depth - canon_depth.view(b, -1).mean(1).view(b, 1, 1)
    canon_depth = canon_depth.tanh()  # squash to (-1, 1)
    canon_depth = depth_rescaler(canon_depth)  # (-1,1) -> (min_depth, max_depth)

    ## clamp border depth
    h = w = image_size
    canon_depth = canon_depth.to(device)
    depth_border = torch.zeros(1, h, w - 4).to(device)
    depth_border = nn.functional.pad(depth_border, (2, 2), mode='constant', value=1)
    canon_depth = canon_depth * (1 - depth_border) + depth_border * border_depth
    canon_depth = torch.cat([canon_depth, canon_depth.flip(2)], 0)  # original + flipped: 1x256x256 -> 2x256x256

    canon_normal = get_normal_from_depth(canon_depth)  # 2x256x256x3

    ## export to obj strings
    vertices = depth_to_3d_grid(canon_depth, inv_K)  # BxHxWx3, B=2 (original + flip)
    objs, mtls = export_to_obj_string(vertices, canon_normal)  # note: my run appeared to get stuck at this line

    with open(os.path.join(save_dir, 'result.mtl'), "w") as f:
        f.write(mtls[0].replace('$TXTFILE', input_path))
    with open(os.path.join(save_dir, 'result.obj'), "w") as f:
        f.write(objs[0].replace('$MTLFILE', './result.mtl'))
```
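One thing that stands out in the snippet above (just a guess on my part): cv2.imread returns values in [0, 255], so after mean-centering, canon_depth.tanh() saturates almost everywhere to ±1, which would produce a nearly binary depth map and a stepped mesh. Assuming the GT depth PNG encodes depth linearly in [0, 255] (an assumption; the exact encoding depends on how the BFM depth was saved), mapping the pixel values directly into [min_depth, max_depth] and skipping the tanh may behave better:

```python
# assumes the PNG stores depth linearly in [0, 255]; replaces the mean-centering + tanh above
depth_01 = torch.Tensor(depth_input).permute(2, 0, 1).mean(dim=0, keepdim=True) / 255.0
canon_depth = depth_01 * (max_depth - min_depth) + min_depth  # -> [min_depth, max_depth]
```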
Hello @YokkaBear, while reproducing this paper I also got reconstructed face images with only half of the face. Have you solved this problem? Looking forward to your reply, thanks!
Overall it is a dataset issue. I would suggest doing some data preprocessing (e.g., cropping out the key facial region as the author suggested) or data augmentation, or switching to another dataset.
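If it helps, here is a small illustrative example of the kind of preprocessing/augmentation meant here, using torchvision (the exact transforms and sizes are only a suggestion; cropping the face region, e.g. with an MTCNN-based crop as above, matters most):

```python
from torchvision import transforms

# crop the face region first (e.g. with an MTCNN-based crop), then lightly augment
train_tf = transforms.Compose([
    transforms.Resize(288),
    transforms.CenterCrop(256),
    transforms.ColorJitter(brightness=0.1, contrast=0.1),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
```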
OK, thank you!
Congrats on your best paper award, and thank you for generously open-sourcing the code.
I have now been through the training process and obtained a model trained for 70 epochs. After running the test code on the test set, I got a series of directories containing the images used for 3D reconstruction. However, I did not find a file that outputs the evaluation metrics such as scale-invariant depth error (SIDE) or mean angle deviation (MAD) mentioned in the paper.
So I wonder how to output, or where to find, these evaluation metrics to measure the performance of my trained model, and whether any ground truth data is needed for this evaluation process.
Looking forward to your reply and help, much thanks.