yuchenrao opened 3 years ago
re 1: The loaded mask is further preprocessed during training by this function: https://github.com/xiumingzhang/GenRe-ShapeHD/blob/ee42add2707de509b5914ab444ae91b832f75981/models/marrnetbase.py#L104, which converts it into a single-channel mask, so there shouldn't be any difference between training and testing, at least for the masking operation.

re 2: The original evaluation code was written with an even more outdated version of PyTorch than the one used in this repo, and relied on custom CUDA kernels for Chamfer distance. What the code did was:
For this repo, we reported numbers on the Pix3D dataset using the eval code from Pix3D.
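Regarding the single-channel conversion mentioned in re 1 above: the actual logic lives in the linked `marrnetbase.py` function. Purely as an illustration of what such a conversion typically does (this is a sketch, not the repo's code; the function name and the "any nonzero channel is foreground" rule are assumptions):

```python
import numpy as np

def to_single_channel(mask):
    """Collapse an (H, W) or (H, W, C) mask to a binary (H, W) float mask.

    Hypothetical helper: any nonzero value in any channel counts as
    foreground. The repo's real preprocessing is in models/marrnetbase.py.
    """
    if mask.ndim == 3:
        mask = mask.max(axis=2)  # reduce channels: nonzero anywhere -> foreground
    return (mask > 0).astype(np.float32)
```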
Thanks a lot for your quick reply!
re-re 1: I think testing and training use different pre-processing methods; correct me if I am wrong:
re-re 2: Thanks a lot for your explanation! I will try to follow your steps and see how it works. Could you also explain more about *_gt_rotvox_samescale_128.npz? Thanks a lot!
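For context, these files appear to be ordinary NumPy `.npz` archives. A sketch of the format assumed by the evaluation snippet later in this thread (a 128³ occupancy grid stored under the key `voxel`; the file name below is a dummy, not a real dataset path):

```python
import numpy as np

# Write a dummy voxel archive in the assumed format, then read it back.
vox = np.zeros((128, 128, 128), dtype=np.float32)
vox[60:68, 60:68, 60:68] = 1.0  # an 8x8x8 occupied block as placeholder content
np.savez('dummy_gt_rotvox_samescale_128.npz', voxel=vox)

with np.load('dummy_gt_rotvox_samescale_128.npz') as data:
    loaded = data['voxel']  # same key the evaluation code uses
```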
@ztzhang Hi~ Here are some updates from my side. Could you check it when you have time? Thanks a lot!
I used the evaluation method from Pix3D; here is the relevant part of the code (in genre_full_model.Model_test):
```python
def test_on_batch(self, batch_i, batch, use_trimesh=True):
    outdir = join(self.output_dir, 'batch%04d' % batch_i)
    makedirs(outdir, exist_ok=True)
    pred = self.predict(batch, load_gt=False, no_grad=True)  # not using trimesh
    output = self.pack_output(pred, batch, add_gt=False)
    self.visualizer.visualize(output, batch_i, outdir)
    np.savez(outdir + '.npz', **output)
    # calculate CD
    pred_vox = output['pred_voxel'][0][0]
    pred_vox = self.sigmoid(pred_vox)
    # get gt voxel
    file1 = batch['rgb_path'][0][:-7] + 'gt_rotvox_samescale_128.npz'
    with np.load(file1) as data:
        val = data['voxel']
    val = np.transpose(val, (0, 2, 1))
    val = np.flip(val, 2)
    voxel_surface = val - binary_erosion(
        val, structure=np.ones((3, 3, 3)), iterations=2).astype(float)
    voxel_surface = voxel_surface[None, ...]
    voxel_surface = np.clip(voxel_surface, 0, 1)
    gt_vox = voxel_surface[0]
    pred_pts = self.get_pts(pred_vox, 0.4, 1024)  # tried 0.3, 0.4, 0.5; results shown in 3
    gt_pts = self.get_pts(gt_vox, 0.4, 1024)
    cd_d = nndistance_score(
        torch.from_numpy(pred_pts).cuda().unsqueeze(0).float(),
        torch.from_numpy(gt_pts).cuda().unsqueeze(0).float())  # nndistance in toolbox
```
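For reference, `nndistance_score` here comes from the toolbox's custom CUDA kernel. A minimal NumPy sketch of the symmetric squared Chamfer distance it is assumed to compute (the real kernel may use a different reduction, e.g. sum instead of mean, or unsquared distances, so treat this only as an illustration):

```python
import numpy as np

def chamfer_distance(pts_a, pts_b):
    """Symmetric squared Chamfer distance between (N, 3) and (M, 3) point sets."""
    # Pairwise squared distances, shape (N, M)
    diff = pts_a[:, None, :] - pts_b[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    # Each point's squared distance to its nearest neighbor in the other set
    a_to_b = d2.min(axis=1)
    b_to_a = d2.min(axis=0)
    return a_to_b.mean() + b_to_a.mean()
```

This brute-force version is O(N·M) in memory, which is fine for 1024 points a side but is exactly why the original code needed a CUDA kernel at scale.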
```python
def get_pts(self, pred_vox, threshold, pts_size):
    empty_voxel = False
    if pred_vox.max() < threshold:
        # dummy isosurface
        empty_voxel = True
        points = np.zeros((pts_size, 3))
    else:
        points = self.get_surface_points(pred_vox, threshold, pts_size)  # same function as in pix3d
    if not empty_voxel:
        bound_l = np.min(points, axis=0)
        bound_h = np.max(points, axis=0)
        points = points - (bound_l + bound_h) / 2
        points = points / (bound_h - bound_l).max()
    return points
```
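As a sanity check on the normalization at the end of `get_pts`: it centers the tight bounding box at the origin and scales so the longest side is 1, so points end up spanning [-0.5, 0.5] along the longest axis. A standalone version:

```python
import numpy as np

def normalize_points(points):
    """Center the bounding box at the origin; scale its longest side to 1."""
    bound_l = points.min(axis=0)
    bound_h = points.max(axis=0)
    points = points - (bound_l + bound_h) / 2  # center of the tight bounding box
    return points / (bound_h - bound_l).max()  # longest side -> 1

pts = np.array([[0.0, 0.0, 0.0], [2.0, 1.0, 1.0]])
out = normalize_points(pts)  # longest axis now spans [-0.5, 0.5]
```

One consequence worth noting: the predicted and ground-truth point sets are each normalized with respect to their own bounding boxes, independent of any camera pose.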
Here is an example of pred_vox (red) vs. gt_vox (green); they are not aligned with each other. Do I need to use the camera pose to transform them?
[image: pred_vox (red) vs. gt_vox (green) visualization]
And here are the results with different thresholds for different classes; the "paper" values are from Table 1 in the GenRe paper, and my results differ from them substantially.
Do you have any idea what's going wrong? Thank you very much!
Hi, I am trying to reproduce the results in the GenRe paper, but I get much worse performance when evaluating with the Pix3D method you mentioned before. Could you give me some advice about this?
Before everything, I would like to mention that the outputs for the test samples look pretty good to me.
Thanks a lot!