cuiziteng / Aleth-NeRF

🌕 [AAAI 2024] Aleth-NeRF: Illumination Adaptive NeRF with Concealing Field Assumption (Low-light enhance / Exposure correction + NeRF)
Apache License 2.0

RuntimeError: number of dims don't match in permute #2

Closed ze-n-g closed 1 year ago

ze-n-g commented 1 year ago

Thank you very much for your work. When I run

```
CUDA_VISIBLE_DEVICES=0 python3 run.py --ginc configs/LOM/aleth_nerf/aleth_nerf_buu.gin --eta 0.1
```

I get the error `RuntimeError: number of dims don't match in permute`. Console output:

```
the scene name is: buu
the log dir is: ./logs/aleth_nerf_blender_buu_220901eta0.1
Global seed set to 220901
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/home/zenglongjian/.conda/envs/aleth_nerf/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py:613: UserWarning: Checkpoint directory /home/zenglongjian/Aelth-NeRF/logs/aleth_nerf_blender_buu_220901eta0.1 exists and is not empty.
  rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name  | Type       | Params
-------------------------------------
0 | model | Aleth_NeRF | 1.3 M
-------------------------------------
1.3 M     Trainable params
0         Non-trainable params
1.3 M     Total params
5.296     Total estimated model params size (MB)
Epoch 0: 100%|█| 12523/12523 [59:02<00:00, 3.54it/s, loss=0.000127, v_num=0, train/psnr1=43.70, train/psnr...
Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to /home/zenglongjian/.cache/torch/hub/checkpoints/vgg16-397923af.pth
100%|██████████| 528M/528M [01:02<00:00, 8.81MB/s]
Downloading: "https://github.com/richzhang/PerceptualSimilarity/raw/master/lpips/weights/v0.1/vgg.pth" to /home/zenglongjian/.cache/torch/hub/checkpoints/vgg.pth
100%|██████████| 7.12k/7.12k [00:00<00:00, 3.37MB/s]
Epoch 9: 100%|█| 12523/12523 [58:35<00:00, 3.56it/s, loss=9.05e-05, v_num=0, train/psnr1=44.00, train/psnr...
`Trainer.fit` stopped: `max_steps=125000` reached.
the checkpoint path is: ./logs/aleth_nerf_blender_buu_220901eta0.1/last.ckpt
Restoring states from the checkpoint path at ./logs/aleth_nerf_blender_buu_220901eta0.1/last.ckpt
Lightning automatically upgraded your loaded checkpoint from v1.7.6 to v1.9.5. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file logs/aleth_nerf_blender_buu_220901eta0.1/last.ckpt`
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at ./logs/aleth_nerf_blender_buu_220901eta0.1/last.ckpt
Testing DataLoader 0: 100%|██████████| 69/69 [00:15<00:00, 4.44it/s]
Traceback (most recent call last):
  File "run.py", line 244, in <module>
    run(
  File "/home/zenglongjian/.conda/envs/aleth_nerf/lib/python3.8/site-packages/gin/config.py", line 1605, in gin_wrapper
    utils.augment_exception_message_and_reraise(e, err_str)
  File "/home/zenglongjian/.conda/envs/aleth_nerf/lib/python3.8/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
    raise proxy.with_traceback(exception.__traceback__) from None
  File "/home/zenglongjian/.conda/envs/aleth_nerf/lib/python3.8/site-packages/gin/config.py", line 1582, in gin_wrapper
    return fn(*new_args, **new_kwargs)
  File "run.py", line 191, in run
    trainer.test(model, data_module, ckpt_path=ckpt_path)
  File "/home/zenglongjian/.conda/envs/aleth_nerf/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 794, in test
    return call._call_and_handle_interrupt(
  File "/home/zenglongjian/.conda/envs/aleth_nerf/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 38, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/zenglongjian/.conda/envs/aleth_nerf/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 842, in _test_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/home/zenglongjian/.conda/envs/aleth_nerf/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1112, in _run
    results = self._run_stage()
  File "/home/zenglongjian/.conda/envs/aleth_nerf/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1188, in _run_stage
    return self._run_evaluate()
  File "/home/zenglongjian/.conda/envs/aleth_nerf/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1228, in _run_evaluate
    eval_loop_results = self._evaluation_loop.run()
  File "/home/zenglongjian/.conda/envs/aleth_nerf/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 206, in run
    output = self.on_run_end()
  File "/home/zenglongjian/.conda/envs/aleth_nerf/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 180, in on_run_end
    self._evaluation_epoch_end(self._outputs)
  File "/home/zenglongjian/.conda/envs/aleth_nerf/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 288, in _evaluation_epoch_end
    self.trainer._call_lightning_module_hook(hook_name, output_or_outputs)
  File "/home/zenglongjian/.conda/envs/aleth_nerf/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1356, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/home/zenglongjian/Aelth-NeRF/src/model/aleth_nerf/model.py", line 424, in test_epoch_end
    darknesss = self.alter_gather_cat_conceil(outputs, "darkness", all_image_sizes)
  File "/home/zenglongjian/Aelth-NeRF/src/model/interface.py", line 52, in alter_gather_cat_conceil
    all = all.permute((1, 0, 2)).flatten(0, 1)
RuntimeError: number of dims don't match in permute
  In call to configurable 'run' (<function run at 0x7efb2cf63550>)
Testing DataLoader 0: 100%|██████████| 69/69 [00:16<00:00, 4.31it/s]
```
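The failure is easy to reproduce in isolation: `Tensor.permute((1, 0, 2))` requires a 3-D input, so a 2-D tensor (for example, a stacked per-image output that lacks a channel axis) raises exactly this error. The shapes below are illustrative assumptions, not the model's actual ones:

```python
import torch

# A 3-D stack (num_images, num_rays, channels): the permute used in
# alter_gather_cat_conceil works and flattens to (num_rays * num_images, channels).
ok = torch.zeros(2, 5, 1)
flat = ok.permute((1, 0, 2)).flatten(0, 1)
print(flat.shape)  # torch.Size([10, 1])

# A 2-D stack (num_images, num_rays) has only two dims, so a three-axis
# permutation fails with the RuntimeError shown in the traceback above
# (exact message wording varies by torch version).
bad = torch.zeros(2, 5)
try:
    bad.permute((1, 0, 2))
except RuntimeError as e:
    print(e)
```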

cuiziteng commented 1 year ago

Hi, thanks for your interest in our work. It seems this error happens at inference time, when storing the concealing field. I'll look into it in the next few days.

In the meantime, you could remove those lines (lines 424 and 425 in model.py) and try again. Since the error occurs only at inference time, you can rerun evaluation without retraining from scratch.
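An alternative to removing the lines would be to make the flattening step tolerant of a missing channel axis. A minimal sketch of that idea (`gather_and_flatten` is a hypothetical helper for illustration, not the repository's actual fix):

```python
import torch

def gather_and_flatten(stacked: torch.Tensor) -> torch.Tensor:
    """Flatten per-image outputs to (num_rays * num_images, channels),
    adding a trailing channel axis when the stacked output is only 2-D."""
    if stacked.dim() == 2:                # (num_images, num_rays)
        stacked = stacked.unsqueeze(-1)   # -> (num_images, num_rays, 1)
    return stacked.permute((1, 0, 2)).flatten(0, 1)

# Both shapes now flatten without the permute error:
print(gather_and_flatten(torch.zeros(4, 8, 3)).shape)  # torch.Size([32, 3])
print(gather_and_flatten(torch.zeros(4, 8)).shape)     # torch.Size([32, 1])
```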

ze-n-g commented 1 year ago

Thank you for your reply. Please let me know when there is an update on this issue.

cuiziteng commented 1 year ago

Hi, I have removed those two lines (lines 424 and 425 in model.py; they are now deleted from the repo, so no edit is needed on your side) and tried again with a single GPU on the LOM dataset (buu scene). It generates the rendering results, so you can try the code again now.

Note that you can add `--ginb run.runtrain=False` to the command for evaluation only (no need to retrain the buu scene).

As follows:

[image: screenshot of the evaluation-only run]
ze-n-g commented 1 year ago

Thank you very much for your help, now it works.

cuiziteng commented 1 year ago

You are welcome ~