szymanowiczs / splatter-image

Official implementation of 'Splatter Image: Ultra-Fast Single-View 3D Reconstruction' (CVPR 2024)
https://szymanowiczs.github.io/splatter-image
BSD 3-Clause "New" or "Revised" License

Reproducing results in the paper #11

Closed ngailapdi closed 7 months ago

ngailapdi commented 7 months ago

Hi, thank you for the great work.

I followed the default parameters in the config file and trained on SRN chairs single-view. However, I did not get the same results as in the paper at the end.

PSNR_novel: 18.43
LPIPS_novel: 0.10
SSIM_novel: 0.89

I wonder if the hyperparameters used to train your model differ from the defaults in the config file? Could you please provide the hyperparameters that you used?
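For reference, the novel-view PSNR quoted above is the standard peak signal-to-noise ratio; a minimal sketch of how it is computed (assuming images normalised to [0, 1]; this is generic, not the repo's evaluation code):

```python
import numpy as np

def psnr(pred: np.ndarray, target: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images in [0, max_val]."""
    mse = float(np.mean((pred - target) ** 2))
    return 10.0 * np.log10(max_val ** 2 / mse)

# e.g. a uniform per-pixel error of 0.1 gives MSE = 0.01, i.e. 20 dB
```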

mengxuyiGit commented 7 months ago

Hi! That's amazing that you can train the network! Have you encountered any problem with the training process being killed? Thanks!

[image]
mengxuyiGit commented 7 months ago

Also, my LPIPS loss remains unchanged during training. What do your training loss curves look like? Many thanks!

[image]
ngailapdi commented 7 months ago

> Also, my LPIPS loss remains unchanged during training. What do your training loss curves look like? Many thanks!

Hi, the LPIPS loss is not applied until 800k steps, so you won't see it change before then.
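For context, the schedule implied by the released config (`loss: l2`, `lambda_lpips: 0.01`, `start_lpips_after: 800001`) can be sketched as below; the function name is illustrative, not taken from the repo:

```python
def total_loss(l2_loss: float, lpips_value: float, iteration: int,
               lambda_lpips: float = 0.01,
               start_lpips_after: int = 800001) -> float:
    """L2 only for the first 800k steps; weighted LPIPS added afterwards."""
    if iteration < start_lpips_after:
        return l2_loss
    return l2_loss + lambda_lpips * lpips_value
```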

yuchenlichuck commented 7 months ago

```
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off] [10/01 17:01:14]
/home/dubaiprince/miniconda3/envs/gsgen/lib/python3.9/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/home/dubaiprince/miniconda3/envs/gsgen/lib/python3.9/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
Loading model from: /home/dubaiprince/miniconda3/envs/gsgen/lib/python3.9/site-packages/lpips/weights/v0.1/vgg.pth [10/01 17:01:15]
Beginning training [10/01 17:01:15]
```
yuchenlichuck commented 7 months ago

My code is stuck here

szymanowiczs commented 7 months ago

At a glance, the parameters in the provided config should take you to the same results as in the paper. @ngailapdi can you share what command you used to train the model? How many iterations did you train for? What do your validation PSNR curves look like? Do the visualisations look reasonable? Can you share the config.yaml file that the code output?

ngailapdi commented 7 months ago

> At a glance, the parameters in the provided config should take you to the same results as in the paper. @ngailapdi can you share what command you used to train the model? How many iterations did you train for? What do your validation PSNR curves look like? Do the visualisations look reasonable? Can you share the config.yaml file that the code output?

@szymanowiczs Hi, thank you for your response. This is the command I use to train the model on SRN chairs: `python train_network.py +dataset=[chairs]`. I trained for 1,000,000 iterations; here is the PSNR novel curve

[image]

and here is the visualization at the last iteration

[image]

and here is the output config.yaml. I just took a look at the appendix, and it seems the learning rate in the default config differs from the one reported there (5e-5 vs 2e-4). I'm not sure whether, over 1M iterations, this makes a big difference.

```yaml
wandb_version: 1

wandb:
  desc: null
  value:
    project: gs_pred
cam_embd:
  desc: null
  value:
    embedding: null
    encode_embedding: null
    dimension: 0
    method: null
general:
  desc: null
  value:
    device: 0
    random_seed: 0
data:
  desc: null
  value:
    training_resolution: 128
    fov: 51.98948897809546
    subset: -1
    input_images: 1
    origin_distances: false
    rescale_to_cars: false
    set_to_pm_1_range: true
    transform_shs: true
    depth_rendering: false
    znear: 1.25
    zfar: 2.75
    category: chairs
    white_background: true
opt:
  desc: null
  value:
    iterations: 1000001
    base_lr: 5.0e-05
    batch_size: 8
    betas:
    - 0.9
    - 0.999
    loss: l2
    imgs_per_obj: 4
    ema:
      use: true
      update_every: 10
      update_after_step: 100
      beta: 0.9999
    lambda_lpips: 0.01
    start_lpips_after: 800001
    pretrained_ckpt: null
    step_lr_at: 800001
diffusion:
  desc: null
  value:
    channel_mult_noise: 0
model:
  desc: null
  value:
    max_sh_degree: 1
    inverted_x: false
    inverted_y: true
    name: SingleUNet
    out_channels: 64
    opacity_scale: 0.001
    opacity_bias: -2.0
    scale_bias: 0.02
    scale_scale: 0.001
    xyz_scale: 1.0e-06
    xyz_bias: 0.0
    depth_scale: 1.0
    depth_bias: 0.0
    network_without_offset: false
    network_with_offset: true
    attention_resolutions:
    - 16
    num_blocks: 4
    cross_view_attention: true
    base_dim: 128
    isotropic: false
    anneal_opacity: true
logging:
  desc: null
  value:
    ckpt_iterations: 1000
    val_log: 10000
    loss_log: 10
    loop_log: 10000
    render_log: 10000
_wandb:
  desc: null
  value:
    python_version: 3.10.13
    cli_version: 0.16.1
    framework: huggingface
    huggingface_version: 4.33.3
    is_jupyter_run: false
    is_kaggle_kernel: false
    start_time: 1703174534.939141
    t:
      1:
      - 1
      - 11
      - 41
      - 49
      - 50
      - 55
      - 105
      2:
      - 1
      - 11
      - 41
      - 49
      - 50
      - 55
      - 105
      3:
      - 2
      - 5
      - 14
      - 16
      - 23
      4: 3.10.13
      5: 0.16.1
      6: 4.33.3
      8:
      - 5
      13: linux-x86_64
```

I think there are a couple of factors that could affect the performance:

  1. Different learning rates
  2. Did you use a camera embedding? The default config suggests the model does not use one.
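If the learning-rate discrepancy matters, the repo's Hydra-style CLI (as in the training command above) should accept a dotted-key override; the exact flag below is an assumption mirroring the key path in the posted config.yaml, not something the repo documents. Printed rather than run here, as a sketch only:

```shell
# hypothetical Hydra override; 'opt.base_lr' mirrors the config tree above
echo 'python train_network.py +dataset=[chairs] opt.base_lr=2e-4'
```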
szymanowiczs commented 7 months ago

OK, thanks for letting me know. At a glance the hyperparameters look fine, and no camera embedding is needed. I'll dig in on my side and get back to you.

szymanowiczs commented 7 months ago

> My code is stuck here

@yuchenlichuck is the code actually stuck, or are you just not seeing any print statements? If you're not seeing any print statements, that's fine; the code doesn't print output until 10k iterations in, when it begins validation.
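In other words, with the posted config (`val_log: 10000`), a gate like the following controls when validation output first appears (an illustrative sketch, not the repo's actual code):

```python
def should_run_validation(iteration: int, val_log: int = 10000) -> bool:
    """True every `val_log` iterations, skipping iteration 0."""
    return iteration > 0 and iteration % val_log == 0

# Nothing is printed at iterations 1..9999; the first validation (and the
# first console output) happens at iteration 10000.
```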

szymanowiczs commented 7 months ago

Hi @ngailapdi, can you pull the latest version of the code and try again? I ran training with the released code and got results like these (blue is cars, grey is chairs). Within 100k iterations you should expect novel-view PSNR on the validation set of 22.8 for cars and 22.0 for chairs. Let me know if it works for you.

[Screenshot 2024-01-22 at 09 21 46]

yuchenlichuck commented 7 months ago

> My code is stuck here

> @yuchenlichuck is the code actually stuck, or are you just not seeing any print statements? If you're not seeing any print statements, that's fine; the code doesn't print output until 10k iterations in, when it begins validation.

Yes, you are right, it's fixed now.

szymanowiczs commented 7 months ago

Great, I'm closing the issue for now; if you encounter further issues reproducing results on the ShapeNet classes, feel free to reopen it.

SJTUwxz commented 5 months ago

Hi! Thanks again for the great work!

I'm reopening this issue because I ran training with the default parameters twice, but neither run matched the results on the SRN-cars and SRN-chairs test sets reported in the paper (especially the PSNR and LPIPS metrics). Please see the results below. Evaluating with the released checkpoint does give very similar results, though.

[image: result-replication]

I'd really appreciate it if you could provide some insights about this issue. Thank you in advance!

szymanowiczs commented 5 months ago

Thanks for highlighting this. I'll look into it; my intuition is that this could be due to stochasticity in training, so I'll see if there is a way to fix the seed and make sure results can be reproduced exactly.
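A typical way to pin down PyTorch training stochasticity looks like the sketch below (the config already exposes `random_seed: 0`; this helper is illustrative, not the repo's code):

```python
import os
import random

import numpy as np
import torch

def seed_everything(seed: int = 0) -> None:
    """Seed all RNGs involved in training and request deterministic kernels."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op without CUDA
    os.environ["PYTHONHASHSEED"] = str(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

Even with all of this, data-loader worker seeding and nondeterministic CUDA ops (e.g. atomics in the Gaussian rasterizer) can still introduce run-to-run variation.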

SJTUwxz commented 5 months ago

Thanks for your response! I will look forward to that!