Not reproducing results with tiny-cuda-nn

Hello, I am trying to use the hash grid from tinycudann together with a pure PyTorch MLP to fit an SDF directly from (3D point, distance) supervision, as in the SDF example from the I-NGP paper.

I used the following command to fit an SDF on a SMPL human model with I-NGP's implementation:

!python scripts/run.py \
    /content/instant-ngp/data/sdf/origSMPL.obj \
    --marching_cubes_res 512 \
    --n_steps 2000 \
    --save_mesh sdf_SMPL.ply \
    --save_snapshot {snapshot_path}

The results are very good, especially considering that only 2000 iterations are performed:

[Screenshot from 2023-02-25 22-16-46]

My implementation is a minimal system that contains only a training loop, a tinycudann hash grid, and a PyTorch MLP that predicts the distance for a given point. For sampling, I draw 4/8 of the points on the surface, 3/8 around it, and 1/8 at random within the AABB.
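For reference, the 4/8, 3/8, 1/8 mix described above can be sketched roughly as follows. This is only an illustration, not the code used in my experiments: `surface_points` is a hypothetical list of points already sampled on the mesh, and the Gaussian perturbation with standard deviation `sigma` is an assumption about what "around the surface" means.

```python
import random


def split_batch(batch_size):
    """Return (n_surface, n_near, n_uniform) for the 4/8, 3/8, 1/8 mix."""
    n_surface = batch_size * 4 // 8
    n_near = batch_size * 3 // 8
    # Whatever is left over goes to the uniform samples.
    n_uniform = batch_size - n_surface - n_near
    return n_surface, n_near, n_uniform


def sample_batch(surface_points, aabb_min, aabb_max, batch_size,
                 sigma=0.01, rng=None):
    """Draw one training batch of 3D query points.

    surface_points: hypothetical list of (x, y, z) tuples lying on the mesh.
    sigma: assumed std. dev. of the Gaussian offset for near-surface points.
    """
    rng = rng or random.Random()
    n_surface, n_near, n_uniform = split_batch(batch_size)

    # 4/8: points exactly on the surface (target distance is 0).
    on_surface = [rng.choice(surface_points) for _ in range(n_surface)]

    # 3/8: surface points perturbed by small Gaussian noise.
    near_surface = [
        tuple(c + rng.gauss(0.0, sigma) for c in rng.choice(surface_points))
        for _ in range(n_near)
    ]

    # 1/8: points drawn uniformly inside the axis-aligned bounding box.
    uniform = [
        tuple(rng.uniform(lo, hi) for lo, hi in zip(aabb_min, aabb_max))
        for _ in range(n_uniform)
    ]
    return on_surface, near_surface, uniform
```

The distances for the perturbed and uniform points still have to be computed against the mesh; that step is left out of this sketch.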

The results are not quite as good:

[Screenshot from 2023-02-25 22-19-22]

The model just fails to capture the finer details of the fingers and the head.

I don't understand where the discrepancy could come from. I tried improving on the following aspects:

- Parameters of the hash grid: I set the same values that I-NGP selects by default: Nmin=16, b=1.38191, F=2, T=2^19, L=16. Increasing the maximum resolution or the codebook size T does not really have an impact, except on speed.
- Sampling strategy: it does not seem to matter much, as long as enough samples are drawn.
- Normalization: I tried both spherical and AABB normalization. Neither seems to have a strong effect.
- Learning rate: a well-chosen learning rate with an exponential scheduler improved the results quite a lot, but I cannot get any further improvement from it.
- Batch size: I initially used a modest size of 2^11, but it was hard to reach convergence, so I switched to a much larger 2^19, as in the paper.
- Activations and initialization of the MLP: I could not get any improvement from these.
- Adding a bit of weight decay: even with very low decay values like 1e-7, the surface gets smoothed out, losing the fine details and the hands.
- Oversampling the more difficult parts, like the hands: it does not seem to have any effect.
- A pure MLP: it cannot properly learn the SDF either.
- Overfitting a reduced number of samples: this works.
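To make the hash grid settings above concrete, here is a small sketch of what they imply per level. It uses the formula from the I-NGP paper, N_l = floor(Nmin * b^l), and treats a level as dense when its (N_l + 1)^3 vertices fit into the table of size T. The exact rounding inside tiny-cuda-nn may differ slightly, so take the numbers as approximate. The dictionary at the end uses the configuration keys of the tiny-cuda-nn `HashGrid` encoding and is only built here, not passed to the library.

```python
import math

# Values quoted in the list above.
N_MIN = 16      # coarsest grid resolution (Nmin)
B = 1.38191     # per-level growth factor (b)
L = 16          # number of levels
F = 2           # features per level
LOG2_T = 19     # log2 of the hash table size
T = 2 ** LOG2_T


def level_resolutions(n_min=N_MIN, b=B, n_levels=L):
    """Grid resolution of each level, following N_l = floor(Nmin * b^l)."""
    return [math.floor(n_min * b ** level) for level in range(n_levels)]


def is_dense(resolution, table_size=T):
    """True if every grid vertex of this level gets its own table entry."""
    return (resolution + 1) ** 3 <= table_size


resolutions = level_resolutions()
dense_levels = [r for r in resolutions if is_dense(r)]

# The same parameters written as a tiny-cuda-nn HashGrid encoding config.
encoding_config = {
    "otype": "HashGrid",
    "n_levels": L,
    "n_features_per_level": F,
    "log2_hashmap_size": LOG2_T,
    "base_resolution": N_MIN,
    "per_level_scale": B,
}

for level, res in enumerate(resolutions):
    kind = "dense" if is_dense(res) else "hashed"
    print(f"level {level:2d}: resolution {res:5d} ({kind})")
```

With these values the finest level ends up at a resolution of roughly 2048, and only the coarsest few levels are collision-free; all finer levels share the 2^19 table entries through hashing.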

Especially since even a large, pure MLP architecture cannot reach the same results as I-NGP, I suspect that there are other critical optimizations I am not taking into account, or that I am making some gross error.

I would greatly appreciate it if others with experience with these systems could share their insight. What improvements is I-NGP making that I am not?
