MLP to parameterize SDF

JiyouSeo commented 2 years ago

Hello, thank you for your great work. In #9 and in the paper (8.5), you said that you used a DVR-based MLP to optimize the SDF. I have several questions about it.

1. How many iterations did you run to pre-train the sphere?
2. Did you set the MLP output shape to (size, 1), meaning SDF only, or to (size, 4), meaning SDF plus deformation?
3. Did you also use the ModuleList in the DVR decoder to process the latent condition code c?

jmunkberg commented 2 years ago

Thanks @JiyouSeo

For sphere initialization, we didn't pre-train, but instead initialized the SDF directly as something like
```
# Sphere init
sdf = (util.length(self.verts) / scale) - 0.45
```
around https://github.com/NVlabs/nvdiffrec/blob/main/geometry/dmtet.py#L177
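As a rough illustration of the sphere initialization described above, here is a minimal PyTorch sketch. It assumes `torch.linalg.norm` as a stand-in for nvdiffrec's `util.length`, and the toy grid and `scale=1.0` are hypothetical values chosen only so the example runs on its own.

```python
import torch

def sphere_init_sdf(verts: torch.Tensor, scale: float, radius: float = 0.45) -> torch.Tensor:
    """SDF of a sphere evaluated at grid vertices: |v| / scale - radius."""
    # torch.linalg.norm is an assumed stand-in for nvdiffrec's util.length.
    return torch.linalg.norm(verts, dim=-1) / scale - radius

# Hypothetical toy grid: 5 x 5 x 5 points in [-1, 1]^3 (not the real tet grid).
axis = torch.linspace(-1.0, 1.0, 5)
verts = torch.stack(torch.meshgrid(axis, axis, axis, indexing="ij"), dim=-1).reshape(-1, 3)

sdf = sphere_init_sdf(verts, scale=1.0)
# Vertices near the origin get negative values, corners get positive values,
# so the zero level set is a sphere lying inside the grid.
```

The point of starting from a field like this is that it already contains both signs, so a surface can be extracted from the first iteration.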

For 2-3, maybe @frankshen07 can comment further.

JiyouSeo commented 2 years ago

Thanks for responding, @jmunkberg! I had mistakenly thought you pre-trained for the sphere initialization. Now I see. But I have a few more questions.

Is there a reason the sphere initialization is done differently for the DTU dataset?
How did you pre-train the MLP, then? Do you mean the DMTet optimization around https://github.com/NVlabs/nvdiffrec/blob/main/train.py#L594?
What did you use as the input of the MLP? Please let me know what the input and output of the MLP are.

I'm not sure whether you replaced the code below (around https://github.com/NVlabs/nvdiffrec/blob/main/geometry/dmtet.py#L179) with the MLP architecture.

```
self.sdf    = torch.nn.Parameter(sdf.clone().detach(), requires_grad=True)
self.register_parameter('sdf', self.sdf)

self.deform = torch.nn.Parameter(deform, requires_grad=True)
self.register_parameter('deform', self.deform)
```

frankshen07 commented 2 years ago

Hello @JiyouSeo ,

The pretraining is straightforward – in each iteration, we randomly sample 1000 points inside a unit cube and compute MSE loss between the predicted SDF at these points and SDF of a sphere (with radius of 0.4). We train for 1000 iterations with lr of 1e-3.
We use the network from https://github.com/autonomousvision/differentiable_volumetric_rendering/blob/5a190104b9f8143125beed714d33f265f5006f30/im2mesh/dvr/models/decoder.py#L7. The input is 3D points without condition code c. The output stays the same, and the last 3 channels are interpreted as deformation vector.
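The pretraining recipe above could be sketched roughly as follows. `TinySDFMLP` is a hypothetical stand-in for the DVR decoder (the real network is the one linked above), and the Adam optimizer and the cube being centred at the origin are assumptions, since the comment states only the sample count, sphere radius, iteration count, and learning rate.

```python
import torch

class TinySDFMLP(torch.nn.Module):
    """Hypothetical stand-in for the DVR decoder: 3D point -> (sdf, dx, dy, dz)."""

    def __init__(self, hidden: int = 64):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(3, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, 4),  # channel 0: SDF, channels 1-3: deformation
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        return self.net(points)

def pretrain_sphere(mlp, iters=1000, n_points=1000, radius=0.4, lr=1e-3):
    """Fit the SDF channel to a sphere; returns the loss of the last iteration."""
    optimizer = torch.optim.Adam(mlp.parameters(), lr=lr)  # optimizer choice is an assumption
    last_loss = float("nan")
    for _ in range(iters):
        # Random points in a unit cube, assumed here to be centred at the origin.
        points = torch.rand(n_points, 3) - 0.5
        target = points.norm(dim=-1) - radius   # analytic SDF of the sphere
        pred_sdf = mlp(points)[:, 0]            # only the SDF channel is supervised
        loss = torch.nn.functional.mse_loss(pred_sdf, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        last_loss = loss.item()
    return last_loss
```

A call such as `pretrain_sphere(TinySDFMLP())` runs the full 1000 iterations on the CPU. In this sketch the deformation channels are left unsupervised during pretraining, which is also an assumption.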

Please let me know if anything is unclear.

JiyouSeo commented 2 years ago

Thank you for replying, @frankshen07! I have two questions.

So the DVR decoder gets self.verts as input?

I changed self.sdf and self.deform from nn.Parameter to outputs of the MLP (torch.Tensor), but I get a type error when calling loss.backward(), around https://github.com/NVlabs/nvdiffrec/blob/main/render/mlptexture.py#L73.

```
Traceback (most recent call last):
  File "train_ori.py", line 729, in <module>
    geometry, mat = optimize_mesh(glctx, geometry, mat, lgt, dataset_train, dataset_validate,
  File "train_ori.py", line 471, in optimize_mesh
    total_loss.backward()
  File "/opt/conda/lib/python3.8/site-packages/torch/_tensor.py", line 352, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/opt/conda/lib/python3.8/site-packages/torch/autograd/__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/hooks.py", line 175, in hook
    res = user_hook(self.module, grad_inputs, self.grad_outputs)
  File "/workspace/render/mlptexture.py", line 73, in <lambda>
    self.encoder.register_full_backward_hook(lambda module, grad_i, grad_o: (grad_i[0] / gradient_scaling, ))
```

I guess the error occurs because grad_i[0] is None.
Here is my code, which declares the MLP and pre-trains it to a sphere in the DMTetGeometry init,

```
self.mlp = Decoder(FLAGS=FLAGS, multires=6).to('cuda')
self.mlp.pre_train_sphere(1000)
```

and this is the forward pass of the MLP in the getMesh function, around https://github.com/NVlabs/nvdiffrec/blob/main/geometry/dmtet.py#L196

```
self.mlp.train()
pred = self.mlp(self.points)
self.sdf, self.deform = pred[:,0].detach() - 0.1 , pred[:,1:].detach()
```

If this is not correct, how should self.sdf and self.deform be changed from nn.Parameter to outputs of the MLP (torch.Tensor)?
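One thing worth checking in the snippet above is the pair of `.detach()` calls, which turn out to be the cause later in this thread. A detached tensor is cut out of the autograd graph, so no gradient can flow from the loss back to the MLP weights. The toy example below shows the effect, using a single `torch.nn.Linear` layer as a hypothetical stand-in for the real decoder.

```python
import torch

mlp = torch.nn.Linear(3, 4)    # hypothetical stand-in for the SDF/deformation MLP
points = torch.rand(8, 3)

# With detach(): the result no longer tracks gradients.
sdf_detached = mlp(points)[:, 0].detach()
print(sdf_detached.requires_grad)    # False

# Without detach(): gradients reach the MLP weights.
sdf = mlp(points)[:, 0]
sdf.sum().backward()
print(mlp.weight.grad is not None)   # True
```

If the SDF and deformation are the only path from the geometry to the loss, detaching both leaves the MLP with nothing to optimize.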

frankshen07 commented 2 years ago

Yes.
Did you remove these two lines (and the matching ones for sdf)?

```
self.deform = torch.nn.Parameter(deform, requires_grad=True)
self.register_parameter('deform', self.deform)
```

JiyouSeo commented 2 years ago

@frankshen07 Yes, I removed:

```
self.sdf    = torch.nn.Parameter(sdf.clone().detach(), requires_grad=True)
self.register_parameter('sdf', self.sdf)

self.deform = torch.nn.Parameter(deform, requires_grad=True)
self.register_parameter('deform', self.deform)
```

Do I have to use them too?
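For readers following along, here is a minimal sketch of what dropping those parameters might look like: the grid vertices become a fixed buffer, and the SDF and deformation are recomputed from the MLP on every call. The class name `MLPGeometry`, the method name, and the tanh bound of one grid cell divided by `grid_res` are all assumptions made for illustration, not the authors' implementation.

```python
import torch

class MLPGeometry(torch.nn.Module):
    """Hypothetical sketch: SDF and deformation come from an MLP, not nn.Parameter."""

    def __init__(self, verts: torch.Tensor, mlp: torch.nn.Module, grid_res: int):
        super().__init__()
        self.register_buffer("verts", verts)  # fixed grid vertices, not optimized
        self.mlp = mlp                        # the only trainable part
        self.grid_res = grid_res

    def sdf_and_deformed_verts(self):
        pred = self.mlp(self.verts)           # (N, 4), no detach() so gradients flow
        sdf = pred[:, 0]
        deform = pred[:, 1:]
        # Bounding the offset with tanh is an assumption made for this sketch.
        v_deformed = self.verts + torch.tanh(deform) / self.grid_res
        return sdf, v_deformed

# Toy usage with random vertices and a single linear layer as the "MLP".
geom = MLPGeometry(torch.rand(10, 3) * 2 - 1, torch.nn.Linear(3, 4), grid_res=64)
sdf, v_deformed = geom.sdf_and_deformed_verts()
optimizer = torch.optim.Adam(geom.parameters(), lr=1e-3)  # optimizes only the MLP weights
```

With this layout there is nothing left to register with `register_parameter`, because the optimizer picks up the MLP weights through `geom.parameters()`.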

JiyouSeo commented 2 years ago

I solved the above issue by replacing pred[:,0].detach() with pred[:,0]. The cause was that detach() excludes pred[:,0] from gradient tracking (requires_grad becomes False). But now I get a new error:

```
Traceback (most recent call last):
  File "train_ori.py", line 736, in <module>
    geometry, mat = optimize_mesh(glctx, geometry, mat, lgt, dataset_train, dataset_validate,
  File "train_ori.py", line 439, in optimize_mesh
    result_image, result_dict = validate_itr(glctx, prepare_batch(next(v_it), FLAGS.background), geometry, opt_material, lgt, FLAGS)
  File "train_ori.py", line 199, in validate_itr
    buffers = geometry.render(glctx, target, lgt, opt_material)
  File "/workspace/geometry/dmtet.py", line 253, in render
    return render.render_mesh(glctx, opt_mesh, target['mvp'], target['campos'], lgt, target['resolution'], spp=target['spp'],
  File "/workspace/render/render.py", line 214, in render_mesh
    assert mesh.t_pos_idx.shape[0] > 0, "Got empty training triangle mesh (unrecoverable discontinuity)"
AssertionError: Got empty training triangle mesh (unrecoverable discontinuity)
```

I guess the SDF cannot be optimized over the range [-0.1, 0.9]. Below are example max and min values of the SDF when I got the error:

tensor(0.1004, device='cuda:0') tensor(0.0342, device='cuda:0')

Did you apply any normalization, or clamp the SDF values (the MLP output)?
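A possible reading of the values printed above is that the maximum and minimum are both positive. A marching-tetrahedra style extraction only produces triangles in tetrahedra whose corners have mixed SDF signs, so a field with a single sign everywhere would yield an empty mesh. The sketch below is a diagnostic under that assumption. `count_surface_tets` and the two-tet grid are hypothetical, not part of nvdiffrec, and the thread does not confirm how the error was actually resolved.

```python
import torch

def count_surface_tets(sdf: torch.Tensor, tets: torch.Tensor) -> int:
    """Number of tetrahedra whose four corners do not all share the same SDF sign."""
    inside = sdf[tets] < 0             # (T, 4) boolean mask per tet corner
    n_inside = inside.sum(dim=-1)      # corners with negative SDF, per tet
    return int(((n_inside > 0) & (n_inside < 4)).sum())

# Hypothetical two-tet grid over five vertices.
tets = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])

all_positive = torch.tensor([0.0342, 0.05, 0.07, 0.09, 0.1004])
mixed = torch.tensor([-0.1, 0.05, 0.07, 0.09, 0.1004])

print(count_surface_tets(all_positive, tets))  # 0: no sign change, no triangles
print(count_surface_tets(mixed, tets))         # 1: the first tet crosses the surface
```

Logging a count like this before mesh extraction would show whether the MLP output has lost its zero crossing, for example after a constant offset is applied to the SDF.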

JiyouSeo commented 2 years ago

I solved it! Thank you, @frankshen07.

bigfudge2123 commented 2 years ago

Hi, would you mind sharing your code for dmtet.py? I'm stuck in the same situation. @JiyouSeo Thanks

Shubhendu-Jena commented 9 months ago

Hi, how did you solve the error @JiyouSeo ? Did you clamp the sdf or the deformation?

santisy commented 2 months ago

@JiyouSeo Hi, I also got the "Got empty training triangle mesh (unrecoverable discontinuity)" error. How did you solve this?

NVlabs / nvdiffrec #81