Closed jinnan-chen closed 1 year ago
Hi, thanks for your interest in our work :)
Please refer to this comment
Thanks
Hi,
It seems that the per-frame optimization takes quite long, and the mesh quality is not consistent with my hyperparameters. Would it be possible to share your pseudo-GT meshes for all frames and all characters, as well as the evaluation code for the mesh L2 Chamfer Distance (CD) and Normal Consistency (NC)? Thanks!
Sure, the reconstructions along with ARAH results (renderings and meshes) are stored at this link
For evaluation, the script is very messy so I just share it here. I will need to find some time to integrate it into the code base.
import trimesh
import torch
import json
import pytorch3d
import pytorch3d.loss
import pytorch3d.ops
import numpy as np
valid_frames = {
    '313': [0, 30],
    '315': [0, 30, 60, 90, 120, 150, 180, 210, 240, 300, 330, 360, 390],
    '377': [30, 90, 120],
    '386': [150, 180, 270],
    '387': [150],
    '390': [720],
    '392': [30],
    '393': [0, 60, 120, 150, 180, 210, 240],
    '394': [0, 30, 60, 90, 120, 150, 180, 210, 240, 270],
}
def normal_consistency_vertex(pred_trimesh, gt_trimesh):
    """
    :param pred_trimesh: predicted mesh (trimesh)
    :param gt_trimesh: GT mesh (trimesh)
    """
    pred_vertices = np.array(pred_trimesh.vertices)
    pred_normals = np.array(pred_trimesh.vertex_normals)
    gt_vertices = np.array(gt_trimesh.vertices)
    gt_normals = np.array(gt_trimesh.vertex_normals)
    pred_verts_torch = torch.from_numpy(pred_vertices).double().unsqueeze(0).cuda()
    gt_verts_torch = torch.from_numpy(gt_vertices).double().unsqueeze(0).cuda()
    # For each GT vertex, find the nearest predicted vertex and compare normals
    knn_ret = pytorch3d.ops.knn_points(gt_verts_torch, pred_verts_torch)
    p_idx = knn_ret.idx.squeeze(-1).detach().cpu().numpy()
    pred_normals = pred_normals[p_idx, :]
    consistency = 1 - np.linalg.norm(pred_normals - gt_normals, axis=-1).mean()
    return consistency
def filter_mesh(mesh, a, b, d, subject):
    # Filter out potential floating blobs
    labels = trimesh.graph.connected_component_labels(mesh.face_adjacency)
    components, cnt = np.unique(labels, return_counts=True)
    face_mask = (labels == components[np.argmax(cnt)])
    valid_faces = np.array(mesh.faces)[face_mask, ...]
    n_vertices = len(mesh.vertices)
    vertex_mask = np.isin(np.arange(n_vertices), valid_faces)
    mesh.update_faces(face_mask)
    mesh.update_vertices(vertex_mask)
    if subject in ['313', '315']:
        mesh = mesh.slice_plane([0, 0, d], [a, b, 1.0])
    else:
        mesh = mesh.slice_plane([0, 0, d], [-a, -b, -1.0])
    return mesh
nb_losses = []
our_losses = []
nb_nc = []
our_nc = []
with open('ground_planes.json', 'r') as f:
    ground_planes = json.load(f)
subject = '377'
a, b, d = ground_planes[subject]
for idx in valid_frames[subject]:
    gt = trimesh.load('/media/sfwang/smb/shawang/Research/NeuS/exp/ZJUMoCap_{}/{:06d}/womask_sphere/meshes/00300000.ply'.format(subject, idx+1))
    gt = filter_mesh(gt, a, b, d, subject)
    # nb = trimesh.load('/home/sfwang/Research/neuralbody/data/result/if_nerf/xyzc_{}/mesh/{:04d}.ply'.format(subject, idx))
    # nb = trimesh.load('/home/sfwang/Research/animatable_nerf/data/animation/aninerf_{}/posed_mesh/{:04d}.ply'.format(subject, idx))
    nb = trimesh.load('/home/sfwang/sshfs/render_output/neuralbody/meshes_{}/frame{:04d}.ply'.format(subject, idx))
    nb = filter_mesh(nb, a, b, d, subject)
    # ours = trimesh.load('out/meta-avatar-final/shallow-meta-hierarchical_vol-sdf-uni-sample-16_no-erode_pose-view-noise-rot_geo-latent_color-latent-skip3_raw-view_ZJUMOCAP-{}_keep-aspect_stage3-inner-1e-6-pose-net-1e2_surface-skinning-weight-1e1_inside-weight-1e1_4gpus/meshes_final/{:06d}_implicit.ply'.format(subject, idx))
    # ours = filter_mesh(ours, a, b, d, subject)
    # gt.export('tmp/gt.ply')
    # ours.export('tmp/ours.ply')
    # nb.export('tmp/nb.ply')
    # import pdb
    # pdb.set_trace()
    # Normal consistency
    nb_nc.append(normal_consistency_vertex(nb, gt))
    # our_nc.append(normal_consistency_vertex(ours, gt))
    gt_verts = torch.from_numpy(gt.vertices * 100).double().unsqueeze(0).cuda()
    nb_verts = torch.from_numpy(nb.vertices * 100).double().unsqueeze(0).cuda()
    # our_verts = torch.from_numpy(ours.vertices * 100).double().unsqueeze(0).cuda()
    nb_loss = pytorch3d.loss.chamfer_distance(nb_verts, gt_verts)
    # our_loss = pytorch3d.loss.chamfer_distance(our_verts, gt_verts)
    nb_losses.append(nb_loss[0])
    # our_losses.append(our_loss[0])
    # print (nb_loss, our_loss)
    # nb_loss = pytorch3d.ops.knn_points(gt_verts, nb_verts)
    # our_loss = pytorch3d.ops.knn_points(gt_verts, our_verts)
    # nb_loss_ = pytorch3d.ops.knn_points(nb_verts, gt_verts)
    # our_loss_ = pytorch3d.ops.knn_points(our_verts, gt_verts)
    # print (nb_loss_.dists.max(), our_loss_.dists.max())
    # nb_losses.append(nb_loss.dists.mean() + nb_loss_.dists.mean())
    # our_losses.append(our_loss.dists.mean() + our_loss_.dists.mean())
print (torch.stack(nb_losses, dim=0).mean() / 2.0)
# print (torch.stack(our_losses, dim=0).mean() / 2.0)
print (np.mean(nb_nc))
# print (np.mean(our_nc))
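For anyone who wants to sanity-check the numbers without a GPU or PyTorch3D, here is a minimal NumPy sketch of what the script above appears to compute. It assumes that `pytorch3d.loss.chamfer_distance` with its default arguments returns the sum of the two directional means of squared nearest-neighbor distances, and that the factor of 100 is a unit conversion (presumably meters to centimeters; the thread does not say). The function names `nearest_neighbors`, `chamfer_l2`, and `normal_consistency` are hypothetical and not part of the code base.

```python
import numpy as np


def nearest_neighbors(src, dst):
    """For each point in src, return the index of and squared distance to
    its nearest point in dst. Brute force, O(N * M) memory."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)  # (N, M)
    idx = d2.argmin(axis=1)
    return idx, d2[np.arange(len(src)), idx]


def chamfer_l2(pred_verts, gt_verts, scale=100.0):
    """Symmetric squared-L2 Chamfer distance, halved as in the script.

    Assumption: scale=100 mirrors the `vertices * 100` in the script.
    """
    pred = np.asarray(pred_verts, dtype=np.float64) * scale
    gt = np.asarray(gt_verts, dtype=np.float64) * scale
    _, d2_pred_to_gt = nearest_neighbors(pred, gt)
    _, d2_gt_to_pred = nearest_neighbors(gt, pred)
    return 0.5 * (d2_pred_to_gt.mean() + d2_gt_to_pred.mean())


def normal_consistency(pred_verts, pred_normals, gt_verts, gt_normals):
    """1 - mean L2 difference between each GT vertex normal and the normal
    of its nearest predicted vertex (same definition as the script)."""
    gt_verts = np.asarray(gt_verts, dtype=np.float64)
    pred_verts = np.asarray(pred_verts, dtype=np.float64)
    idx, _ = nearest_neighbors(gt_verts, pred_verts)
    diff = np.asarray(pred_normals, dtype=np.float64)[idx] - np.asarray(gt_normals, dtype=np.float64)
    return 1.0 - np.linalg.norm(diff, axis=-1).mean()
```

The brute-force distance matrix is only practical for small point sets; for full-resolution meshes a KD-tree or the original `knn_points` call is the better choice. Note also that the script divides by 2 after averaging over frames, which gives the same result as halving each frame's value.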
Also, on ZJU-MoCap there were some non-negligible calibration errors for sequences beyond 315. This is also observed by the TAVA paper (Appendix D). Please be aware of this when evaluating 3D reconstructions.
Thanks for your complete reply! May I ask why there are only a few frames for certain subjects? Is it because NeuS could not give a visually good reconstruction? Also, regarding the calibration errors you mentioned, we could simply exclude such samples from the evaluation, right?
Yea, for 313 it's because the training set has only 60 frames. For 387, 390, and 392, it was somehow hard to find good frames for reconstruction (by visual inspection); this may be due to the calibration error and the color of the clothes they wear. Eventually, I did not include 387, 390, and 392 in the evaluation. Besides, I only do reconstruction for every 30th frame, similar to the evaluation protocol of novel view synthesis under training poses. Otherwise, the workload of NeuS reconstruction would be too large.
As for the calibration errors, they affect certain cameras, maybe even including some (1 or more) training cameras, as I consistently observe misalignment artifacts in novel view synthesis under certain views, for 377 and beyond. You may exclude them from the camera set, but that requires redoing the NeuS reconstruction and potentially also retraining the models. It's also not clear how to find which cameras are faulty - doing some RANSAC-like triangulation might help to find those outlier cameras. Or you can do camera pose optimization, as was done in the IDR paper, to fix the cameras. But I didn't try those, as I became aware of this issue after the camera-ready deadline...
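To make the RANSAC-like triangulation idea a bit more concrete, here is a rough sketch; it was not part of the original evaluation and is untested on ZJU-MoCap. It assumes you have a 3x4 projection matrix per camera and the 2D observation of the same 3D point (for example a detected keypoint) in every view. Each camera pair is used as a minimal sample to triangulate the point with a linear DLT solve, and the hypothesis with the most cameras under a reprojection threshold wins. The names `triangulate`, `reprojection_errors`, `find_outlier_cameras`, and the 2-pixel threshold are all hypothetical choices.

```python
import itertools

import numpy as np


def triangulate(proj_mats, points_2d):
    """Linear (DLT) triangulation of one 3D point from two or more views."""
    rows = []
    for P, (u, v) in zip(proj_mats, points_2d):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    # The homogeneous solution is the right singular vector associated
    # with the smallest singular value.
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1]
    return X[:3] / X[3]


def reprojection_errors(proj_mats, points_2d, X):
    """Pixel distance between each observation and the reprojection of X."""
    Xh = np.append(X, 1.0)
    errs = []
    for P, uv in zip(proj_mats, points_2d):
        x = P @ Xh
        errs.append(np.linalg.norm(x[:2] / x[2] - np.asarray(uv)))
    return np.array(errs)


def find_outlier_cameras(proj_mats, points_2d, threshold_px=2.0):
    """Return indices of cameras that disagree with the best consensus.

    Every camera pair is tried exhaustively instead of sampling at random,
    which is affordable for a couple of dozen cameras.
    """
    n = len(proj_mats)
    best_inliers = np.zeros(n, dtype=bool)
    for i, j in itertools.combinations(range(n), 2):
        X = triangulate([proj_mats[i], proj_mats[j]],
                        [points_2d[i], points_2d[j]])
        inliers = reprojection_errors(proj_mats, points_2d, X) < threshold_px
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return np.flatnonzero(~best_inliers)
```

A single point says little on its own; in practice one would presumably run this over many keypoints and frames and flag the cameras that are rejected consistently.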
Thanks, but the evaluation is conducted on these frames in your code above, right?
valid_frames = {'313': [0, 30], '315': [0, 30, 60, 90, 120, 150, 180, 210, 240, 300, 330, 360, 390], '377': [30, 90, 120], '386': [150, 180, 270], '387': [150], '390': [720], '392': [30], '393': [0, 60, 120, 150, 180, 210, 240], '394': [0, 30, 60, 90, 120, 150, 180, 210, 240, 270], }
Yes, Table 2 of our paper reports numbers for all sequences except 387, 390, and 392 - the number of good frames is too small for them. Otherwise, the script should reproduce the numbers reported in the paper.
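In other words, the frame set that goes into the reported numbers can be derived from `valid_frames` by dropping those three subjects. A small sketch (the names `EXCLUDED_SUBJECTS` and `evaluation_frames` are hypothetical, not from the code base):

```python
valid_frames = {
    '313': [0, 30],
    '315': [0, 30, 60, 90, 120, 150, 180, 210, 240, 300, 330, 360, 390],
    '377': [30, 90, 120],
    '386': [150, 180, 270],
    '387': [150],
    '390': [720],
    '392': [30],
    '393': [0, 60, 120, 150, 180, 210, 240],
    '394': [0, 30, 60, 90, 120, 150, 180, 210, 240, 270],
}

# Subjects left out of the evaluation, as stated in this thread.
EXCLUDED_SUBJECTS = {'387', '390', '392'}


def evaluation_frames(frames_by_subject, excluded):
    """Keep only the subjects that are part of the evaluation."""
    return {subject: frames
            for subject, frames in frames_by_subject.items()
            if subject not in excluded}


eval_frames = evaluation_frames(valid_frames, EXCLUDED_SUBJECTS)
for subject, frames in eval_frames.items():
    print(subject, len(frames), 'frames')
print('total:', sum(len(frames) for frames in eval_frames.values()))
```

The script above evaluates one subject at a time (`subject = '377'`), so it would need to be run once per remaining subject.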
Closing the issue due to inactivity.
Hi, thanks for the great work!
Would it be possible to release the code for obtaining the GT 3D meshes on the ZJU-MoCap dataset with NeuS? Thanks!