bennyguo / instant-nsr-pl

Neural Surface reconstruction based on Instant-NGP. Efficient and customizable boilerplate for your research projects. Train NeuS in 10min!
MIT License

[Discussion] Make `eps` in the finite difference exponentially decreasing #83

Open · GCChen97 opened 1 year ago

GCChen97 commented 1 year ago

I just noticed that the finite difference in geometry.py is similar to the numerical gradient computation in Neuralangelo (CVPR 2023). Maybe the exponentially decreasing eps strategy from Neuralangelo could be adopted, since eps is constant in this codebase : )
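(For concreteness, a step-based exponential decay in that spirit could look like the sketch below. Neuralangelo actually ties eps to the grid size of the finest active hash level rather than to a fixed step count, and all names and constants here are illustrative.)

```python
def eps_schedule(step, eps_init=1e-2, eps_final=1e-4, total_steps=20000):
    """Exponentially decay the finite-difference step size over training.
    Illustrative sketch only, not Neuralangelo's exact schedule."""
    t = min(max(step / total_steps, 0.0), 1.0)
    # geometric interpolation from eps_init down to eps_final
    return eps_init * (eps_final / eps_init) ** t
```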

bennyguo commented 1 year ago

I also noticed this paper. Seems like an easy adaptation! I'll integrate this feature soon after some experiments.

bennyguo commented 1 year ago

@GCChen97 @xiaohulihutu @alvaro-budria I've pushed an implementation of Neuralangelo. I haven't got the curvature loss to work so feel free to play with it and make it better :)

bennyguo commented 1 year ago

See here for details.

alvaro-budria commented 1 year ago

Thx! I will check the new code and try to come up with something for the curvature loss.

bennyguo commented 1 year ago

Great! In my experiments, the curvature loss (currently commented out in the config file) works on scene63 but will "blow things up" on scene24 after 15000 iterations. So if you would like to tune the curvature loss, scene24 would be a good start :)

alvaro-budria commented 1 year ago

Hi, I am unsure about the current computation of the Laplacian: `laplace = (points_d_sdf[..., 0::2] + points_d_sdf[..., 1::2] - 2 * sdf[..., None]) / (eps ** 2)`

It assigns a weight of 2 to the center sample, instead of 6 (the number of "neighbors"). On Wikipedia the following formula is shown [screenshot of the discrete Laplace operator formula], where setting $\gamma_1 = \gamma_2 = 0$ results in the center being weighted by 6.
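For reference, the $\gamma_1 = \gamma_2 = 0$ case reduces to the standard 7-point stencil:

$$\nabla^2 f \approx \frac{f(x{+}h,y,z)+f(x{-}h,y,z)+f(x,y{+}h,z)+f(x,y{-}h,z)+f(x,y,z{+}h)+f(x,y,z{-}h)-6f(x,y,z)}{h^2}$$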

bennyguo commented 1 year ago

I think you're right. In the current code, laplace is: $$\left(\frac{\partial^2 f}{\partial x^2},\frac{\partial^2 f}{\partial y^2},\frac{\partial^2 f}{\partial z^2}\right)$$ And the curvature loss is: $$\mathcal{L}=\frac{1}{N}\sum\left(\left|\frac{\partial^2 f}{\partial x^2}\right|+\left|\frac{\partial^2 f}{\partial y^2}\right|+\left|\frac{\partial^2 f}{\partial z^2}\right|\right)$$

However, the Laplacian should be: $$\nabla^{2}f = \frac{\partial^2 f}{\partial x^2}+\frac{\partial^2 f}{\partial y^2}+\frac{\partial^2 f}{\partial z^2}$$ and the curvature loss should be: $$\mathcal{L}^{'}=\frac{1}{N}\sum\left|\nabla^{2}f\right|$$

$\mathcal{L}$ and $\mathcal{L}^{'}$ are clearly not the same.

bennyguo commented 1 year ago

I think simply replacing

`laplace = (points_d_sdf[..., 0::2] + points_d_sdf[..., 1::2] - 2 * sdf[..., None]) / (eps ** 2)`

with

`laplace = (points_d_sdf[..., 0::2] + points_d_sdf[..., 1::2] - 2 * sdf[..., None]).sum(-1) / (eps ** 2)`

should be correct?
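(For context, here is a minimal sketch of how the six offset samples and the corrected Laplacian fit together; apart from `points_d_sdf`, `sdf`, and `eps`, the names are assumptions rather than the repo's exact code.)

```python
import torch

def finite_diff_laplacian(sdf_fn, points, eps):
    # Six axis-aligned offsets, interleaved as [+x, -x, +y, -y, +z, -z]
    # so that 0::2 selects the positive and 1::2 the negative samples.
    offsets = torch.tensor([
        [ eps, 0.0, 0.0], [-eps, 0.0, 0.0],
        [0.0,  eps, 0.0], [0.0, -eps, 0.0],
        [0.0, 0.0,  eps], [0.0, 0.0, -eps],
    ], dtype=points.dtype, device=points.device)
    points_d = points[..., None, :] + offsets                        # [..., 6, 3]
    points_d_sdf = sdf_fn(points_d.reshape(-1, 3)).reshape(*points.shape[:-1], 6)
    sdf = sdf_fn(points)                                             # [...]
    # 7-point stencil: second differences along x, y, z, summed before abs.
    return (points_d_sdf[..., 0::2] + points_d_sdf[..., 1::2]
            - 2 * sdf[..., None]).sum(-1) / (eps ** 2)
```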

alvaro-budria commented 1 year ago

Yes, that seems correct. The code was taking the absolute value of each second partial derivative, instead of summing them first and then taking the absolute value.

bennyguo commented 1 year ago

Are you interested in verifying the correctness of this new version? i.e., does it consistently bring quality improvements? If it does, I'd appreciate it if you open a PR.

alvaro-budria commented 1 year ago

Sure, I can check if this new curvature penalty improves the results.

The linear warmup for the curvature loss weight is not implemented, and the paper mentions that this can be important for scenes like DTU 24 where there are concave shapes. I was thinking of updating this lambda in an `on_train_batch_end` callback, roughly as sketched below, but I am unsure whether this is the right approach with PyTorch Lightning.
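(A minimal sketch of that callback idea, assuming a hypothetical `lambda_curvature` attribute on the LightningModule; as the next reply points out, the repo already supports this via its loss-weight scheduling.)

```python
import pytorch_lightning as pl

class CurvatureWeightWarmup(pl.Callback):
    """Hypothetical linear warmup of the curvature loss weight;
    `lambda_curvature` is an assumed attribute, not the repo's interface."""
    def __init__(self, end_value=5e-4, warmup_steps=5000):
        self.end_value = end_value
        self.warmup_steps = warmup_steps

    def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx):
        # ramp linearly from 0 to end_value over the first warmup_steps
        t = min(trainer.global_step / self.warmup_steps, 1.0)
        pl_module.lambda_curvature = t * self.end_value
```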

There are a few hyperparameters whose values currently differ from those in Neuralangelo. In the paper, the hash feature dim is 8 and the maximum number of hash entries per resolution is $2^{22}$; in the configs, the values are 2 and $2^{19}$. I will keep the config values to make a fair comparison.

bennyguo commented 1 year ago

The warm-up is implemented -- you can linearly increase/decrease a loss weight by assigning a tuple of four numbers: [start_step, start_value, end_value, end_step]; the weight will then change linearly from start_value to end_value between start_step and end_step. I listed some differences from the original paper here, and I think it would be fine to keep them unchanged for fair comparisons.
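(As a concrete, hypothetical example, a 5000-step warmup of the curvature weight to $5\cdot10^{-4}$ would be written as follows; the values are illustrative, not a recommendation.)

```yaml
system:
  loss:
    # [start_step, start_value, end_value, end_step]
    lambda_curvature:
      - 0       # start_step
      - 0.0     # start_value
      - 0.0005  # end_value
      - 5000    # end_step
```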

alvaro-budria commented 1 year ago

I tried the corrected Laplacian computation with a subset of DTU scan scenes. I could not observe any improvement in terms of PSNR.

| PSNR | 24 | 37 | 40 | 55 | 63 | 65 |
| --- | --- | --- | --- | --- | --- | --- |
| $\lambda_{curv}=0$ | 31.74 | 26.75 | 30.83 | 31.19 | 35.42 | 32.54 |
| $\lambda_{curv}=0.1$ | 24.05 | - | - | - | - | - |
| $\lambda_{curv}=10^{-3}$ | 24.06 | - | - | - | 31.67 | - |
| $\lambda_{curv}=5\cdot10^{-5}$ | 31.60 | 26.81 | 30.75 | 30.83 | 34.47 | 32.60 |

** For the last two rows ($\lambda_{curv}=10^{-3}$ and $5\cdot10^{-5}$), I added a 5000-iteration warmup on the weight, as the geometry was getting stuck in the low-curvature sphere initialization.

Looking at the generated images, the surface does seem smoother in the case of DTU 24, on the roof (which is actually not desirable): [screenshot]

but in DTU 63, there is no significant change. No curvature loss: [screenshot]

With $\lambda_{curv} = 5 \cdot 10^{-5}$: [screenshot]

I suspect that training longer at each level of detail (5000 iterations in the paper vs. 1000 here) is an important factor, as it allows a higher weight on the curvature penalty and gives the network more time to adapt at each level, but I have not verified this.

bennyguo commented 1 year ago

Thank you! I also think the number of training steps is crucial. Considering this should be the correct implementation, could you open a PR? I'll do some experiments on the new code.

flow-specter commented 1 year ago

> I tried the corrected Laplacian computation with a subset of DTU scan scenes. I could not observe any improvement in terms of PSNR. […]

Hi, how did you add the warmup?

flow-specter commented 1 year ago

I changed some parameters and decayed the curvature loss, then I got this (DTU 24, PSNR = 32.0): [image]

bennyguo commented 1 year ago

@flow-specter Looks good! Could you share the configuration file?

flow-specter commented 1 year ago

> @flow-specter Looks good! Could you share the configuration file?

Sure~ Here is the configuration file:

```yaml
name: neuralangelo-dtu-wmask-scan24
tag: decayCurvature
seed: 42
dataset:
  name: dtu
  root_dir: /data/DTU/scan24
  cameras_file: cameras.npz
  img_downscale: 2
  n_test_traj_steps: 60
  apply_mask: true
model:
  name: neus
  radius: 1.0
  num_samples_per_ray: 1024
  train_num_rays: 256
  max_train_num_rays: 8192
  grid_prune: false
  grid_prune_occ_thre: 0.001
  dynamic_ray_sampling: true
  batch_image_sampling: true
  randomized: true
  ray_chunk: 2048
  cos_anneal_end: 500000
  learned_background: false
  background_color: white
  variance:
    init_val: 0.3
    modulate: false
  geometry:
    name: volume-sdf
    radius: 1.0
    feature_dim: 13
    grad_type: finite_difference
    finite_difference_eps: progressive
    isosurface:
      method: mc
      resolution: 512
      chunk: 2097152
      threshold: 0.0
    xyz_encoding_config:
      otype: ProgressiveBandHashGrid
      n_levels: 16
      n_features_per_level: 8
      log2_hashmap_size: 22
      base_resolution: 32
      per_level_scale: 1.3195079107728942
      include_xyz: true
      start_level: 4
      start_step: 20000
      update_steps: 5000
    mlp_network_config:
      otype: VanillaMLP
      activation: ReLU
      output_activation: none
      n_neurons: 64
      n_hidden_layers: 1
      sphere_init: true
      sphere_init_radius: 0.5
      weight_norm: true
  texture:
    name: volume-radiance
    input_feature_dim: 16
    dir_encoding_config:
      otype: SphericalHarmonics
      degree: 4
    mlp_network_config:
      otype: VanillaMLP
      activation: ReLU
      output_activation: none
      n_neurons: 64
      n_hidden_layers: 2
    color_activation: sigmoid
system:
  name: neus-system
  loss:
    lambda_rgb_mse: 0.0
    lambda_rgb_l1: 1.0
    lambda_mask: 0.1
    lambda_eikonal: 0.1
    lambda_curvature:
      - 5000
      - 0.0005
      - 0.0
      - 500000
```

flow-specter commented 1 year ago

Besides, I found that there is an appearance embedding in Neuralangelo; would you consider adding it? @bennyguo

bennyguo commented 1 year ago

> Besides, I found that there is an appearance embedding in Neuralangelo; would you consider adding it? @bennyguo

I think the appearance embedding aims to handle varying exposure in the Tanks and Temples dataset. We're currently experimenting on DTU, so it's not really needed :)

flow-specter commented 1 year ago

Honestly, I am doing some experiments on Tanks and Temples :)

bennyguo commented 1 year ago

@flow-specter Are you interested in contributing a tanks-and-temples dataset? I'll let you know once I've designed an elegant way to incorporate appearance embeddings.

flow-specter commented 1 year ago

> @flow-specter Are you interested in contributing a tanks-and-temples dataset? I'll let you know once I've designed an elegant way to incorporate appearance embeddings.

I am willing to, but I have been busy with work lately. If I find free time to upload it, I will let you know~

youmi-zym commented 1 year ago

Hi,

I'm working on the Tanks and Temples dataset and got an initial result here. As the authors didn't show a mesh visualization of the Truck scene, I don't know how it compares to Neuralangelo.

Settings:

```yaml
dataset:
  name: colmap
  root_dir: /TanksandTemples/colmap_truck
  img_downscale: 4
  up_est_method: ground
  center_est_method: lookat
  n_test_traj_steps: 120
  apply_mask: false
  load_data_on_gpu: false
model:
  name: neus
  radius: 2.0
```

Other settings follow the DTU config given above.

Furthermore, I didn't add a per-image latent embedding to the color network. The background is rendered following NeuS.

https://github.com/bennyguo/instant-nsr-pl/assets/17737537/a54468ce-3649-41d0-815b-d97d0652a116

bennyguo commented 1 year ago

@youmi-zym The results look fair. You may consider setting a smaller radius, as a large part of the background is now modeled by the foreground.

youmi-zym commented 1 year ago

Hi,

According to the paper description: [image]

I think the curvature loss weight should be:

lambda_curvature:
- 0
- 0.0
- 0.0005
- 5000

rather than

lambda_curvature:
- 5000
- 0.0005
- 0.0
- 500000

as shown above.
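(With the [start_step, start_value, end_value, end_step] convention described earlier, the first variant warms the weight up from 0 to $5\cdot10^{-4}$ over the first 5000 steps, whereas the second decays it from $5\cdot10^{-4}$ to 0 between steps 5000 and 500000.)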

wangyida commented 1 year ago

> @flow-specter Are you interested in contributing a tanks-and-temples dataset? I'll let you know once I've designed an elegant way to incorporate appearance embeddings.

I drafted a branch with appearance embeddings; you can test it by simply adding the following configuration:

    use_appearance_embedding: true
    use_average_appearance_embedding: true
    appearance_embedding_dim: 17

PSNR seems to be a bit higher on my custom outdoor dataset, although the gain is not obvious on the Mip-NeRF 360 dataset.
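(For anyone curious, a per-image appearance embedding along these lines might be wired into the color network roughly as in this sketch; the class and argument names are my own assumptions, not the branch's actual code.)

```python
import torch
import torch.nn as nn

class RadianceWithAppearance(nn.Module):
    """Sketch of a NeRF-W-style per-image appearance embedding feeding
    the color MLP. Names and sizes are illustrative assumptions."""
    def __init__(self, num_images, feat_dim, dir_dim, emb_dim=17, hidden=64):
        super().__init__()
        self.embedding = nn.Embedding(num_images, emb_dim)
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + dir_dim + emb_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),
        )

    def forward(self, features, dirs_enc, image_ids):
        emb = self.embedding(image_ids)  # [N, emb_dim], one row per ray's image
        # At test time one could substitute the mean embedding
        # (cf. use_average_appearance_embedding above).
        return self.mlp(torch.cat([features, dirs_enc, emb], dim=-1))
```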

FangjinhuaWang commented 1 year ago

Could anyone reproduce similar results on Tanks and Temples, e.g. Meetingroom and Courtroom? I used the numerical gradient, coarse-to-fine optimization of the grid, and the appearance embedding; however, I did not observe a clear improvement in reconstruction. Besides, it is surprising that the baseline NeuS performs so well on Tanks and Temples, better than Geo-NeuS.

AIBluefisher commented 1 year ago

> Could anyone reproduce similar results on Tanks and Temples, e.g. Meetingroom and Courtroom? […]

It's amazing that NeuS with Instant-NGP is so good: it outperforms the original NeuS by a large margin. And judging from the PSNR in my testing, it seems the performance gain of Neuralangelo comes more from Instant-NGP than from the tricks proposed in the paper.

LiquidAmmonia commented 1 year ago

> Could anyone reproduce similar results on Tanks and Temples, e.g. Meetingroom and Courtroom? […]

Hi, may I ask which metrics you used to compare your experiments with Geo-NeuS? The Geo-NeuS paper only provides Chamfer distance, while this repo seems to only provide a PSNR evaluation.

FangjinhuaWang commented 1 year ago

> Could anyone reproduce similar results on Tanks and Temples, e.g. Meetingroom and Courtroom? […]
>
> Hi, may I ask which metrics you used to compare your experiments with Geo-NeuS? […]

I am referring to the Chamfer distance values of Geo-NeuS and NeuS on Tanks and Temples reported in the paper.

imbinwang commented 1 year ago

> Could anyone reproduce similar results on Tanks and Temples, e.g. Meetingroom and Courtroom? […]

Hi @FangjinhuaWang, could you share your configuration for Tanks and Temples? I'm also interested in reproducing the result, but my resulting mesh of Barn shows many holes on the ground and roof. [image]

MulinYu commented 1 year ago

> Could anyone reproduce similar results on Tanks and Temples, e.g. Meetingroom and Courtroom? […]
>
> Hi @FangjinhuaWang, could you share your configuration for Tanks and Temples? […]

Hello @imbinwang,

I ran into the same problem. Have you solved the "hole" problem?

Best,
Mulin

KeKer7 commented 4 months ago

> Could anyone reproduce similar results on Tanks and Temples, e.g. Meetingroom and Courtroom? […]
>
> Hi @FangjinhuaWang, could you share your configuration for Tanks and Temples? […]

Hello @imbinwang, can you provide me with the Tanks and Temples configuration file and the version of instant-nsr-pl you used? I am unable to reproduce the Tanks and Temples results using NeuS. Thanks!