Question about reproducing.

ken2576 / vision-nerf

Official PyTorch Implementation of paper "Vision Transformer for NeRF-Based View Synthesis from a Single Input Image", WACV 2023.

MIT License

Question about reproducing. #9

Closed pansanity666 closed 1 year ago

pansanity666 commented 1 year ago

Hi, I am trying to reproduce the training of your checkpoint. However, the code does not seem to support DDP or DP training and evaluation, so I used the default training config with batchsize=1 and trained for 500K iterations. The resulting performance is significantly worse than that of your provided checkpoint. Do you have any idea why? Best,

ken2576 commented 1 year ago

Hi,

The code supports DDP. I have added the command to the README as well.

python -m torch.distributed.launch --nproc_per_node=[#GPUs] train.py --config [config_path] --distributed

Please follow the training setup we used to ensure the same performance: 16 NVIDIA A100 GPUs, each with a batch size of 8.
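
For a sense of scale, here is a rough sketch of the arithmetic, assuming the effective batch size under DDP is simply the number of processes times the per-GPU batch size. The helper name `effective_batch_size` is hypothetical and is not part of the repository.

```python
def effective_batch_size(num_gpus: int, per_gpu_batch: int) -> int:
    """Samples contributing to one optimizer step, assuming one DDP process per GPU."""
    return num_gpus * per_gpu_batch


# Setup described above: 16 GPUs, batch size 8 on each.
reference_setup = effective_batch_size(16, 8)

# Default config used in the question: a single process with batchsize=1.
default_setup = effective_batch_size(1, 1)

print(reference_setup)                   # 128
print(reference_setup // default_setup)  # 128
```

Under that assumption, each optimizer step in the reference setup averages gradients over 128 samples rather than 1, which may account for a large part of the gap.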

pansanity666 commented 1 year ago

Thanks

crazy-stycxj commented 5 months ago

@pansanity666 @ken2576 Hello, I see that you managed to reproduce the results. When I tried to reproduce them, I ran into the following error:

File "<__array_function__ internals>", line 200, in stack
File "/home4T/cxj/anaconda3/envs/VF/lib/python3.8/site-packages/numpy/core/shape_base.py", line 460, in stack
raise ValueError('need at least one array to stack')

Could you give me some pointers? Thanks.
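
For context, NumPy raises this error whenever `np.stack` receives an empty sequence. The traceback above is truncated, so where the empty list comes from in this code base is unknown. One common cause, offered only as a guess, is a data path that matches no files. The sketch below reproduces the error in isolation and shows a guard with a more informative message. The helper `stack_or_explain` and its `source` argument are hypothetical, not part of the repository.

```python
import numpy as np


def stack_or_explain(arrays, source="<unknown>"):
    """Stack arrays, failing with a descriptive message when the input is empty."""
    arrays = list(arrays)
    if not arrays:
        # np.stack would also raise ValueError here, but without saying
        # which input turned out to be empty.
        raise ValueError(f"no arrays to stack; nothing was loaded from {source}")
    return np.stack(arrays)


# Reproduce the original error: stacking an empty list.
try:
    np.stack([])
except ValueError as err:
    print(type(err).__name__)

# With at least one array, stacking adds a new leading axis.
batch = stack_or_explain([np.zeros((2, 3)), np.ones((2, 3))], source="demo")
print(batch.shape)
```

Checking the length of the list just before the failing `np.stack` call (for example, the number of image or pose files found) is usually the quickest way to confirm whether the dataset path is the problem.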