kakaobrain / nerf-factory

An awesome PyTorch NeRF library
https://kakaobrain.github.io/NeRF-Factory
Apache License 2.0
1.27k stars, 107 forks

Error when running. Possible version error with pytorch_lightning #36

Open josemonsalve2 opened 1 year ago

josemonsalve2 commented 1 year ago

Hi,

I am having issues running this project. I suspect it's an issue with the version of pytorch_lightning.

Here's the output:

> python3 -m run --ginc configs/nerf/blender.gin
Traceback (most recent call last):
  File "~/anaconda3/envs/nerf_factory/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "~/anaconda3/envs/nerf_factory/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "~/nerf-factory/run.py", line 23, in <module>
    from pytorch_lightning.plugins import DDPPlugin
ImportError: cannot import name 'DDPPlugin' from 'pytorch_lightning.plugins' (~/anaconda3/envs/nerf_factory/lib/python3.8/site-packages/pytorch_lightning/plugins/__init__.py)

System:

I'm running on a Mac studio with MacOS Ventura 13.3.1.

Here are the versions installed by pip3 install -r requirements.txt:

> pip3 install -r requirements.txt
...
Successfully built torch-scatter torch-efficient-distloss pathtools
Installing collected packages: multidict, frozenlist, yarl, smmap, attrs, async-timeout, aiosignal, soupsieve, packaging, gitdb, fsspec, aiohttp, tqdm, torchmetrics, tifffile, setproctitle, sentry-sdk, scipy, PyYAML, PyWavelets, psutil, protobuf, pathtools, networkx, lightning-utilities, lazy-loader, imageio, GitPython, filelock, docker-pycreds, Click, beautifulsoup4, appdirs, wandb, torch-scatter, torch-efficient-distloss, scikit-image, pytorch-lightning, piqa, opencv-python, ninja, imageio-ffmpeg, gin-config, gdown, functorch, configargparse
Successfully installed Click-8.1.3 GitPython-3.1.31 PyWavelets-1.4.1 PyYAML-6.0 aiohttp-3.8.4 aiosignal-1.3.1 appdirs-1.4.4 async-timeout-4.0.2 attrs-22.2.0 beautifulsoup4-4.12.2 configargparse-1.5.3 docker-pycreds-0.4.0 filelock-3.11.0 frozenlist-1.3.3 fsspec-2023.4.0 functorch-0.1.1 gdown-4.7.1 gin-config-0.5.0 gitdb-4.0.10 imageio-2.27.0 imageio-ffmpeg-0.4.8 lazy-loader-0.2 lightning-utilities-0.8.0 multidict-6.0.4 networkx-3.1 ninja-1.11.1 opencv-python-4.7.0.72 packaging-23.1 pathtools-0.1.2 piqa-1.2.2 protobuf-4.22.3 psutil-5.9.4 pytorch-lightning-2.0.1.post0 scikit-image-0.20.0 scipy-1.9.1 sentry-sdk-1.19.1 setproctitle-1.3.2 smmap-5.0.0 soupsieve-2.4 tifffile-2023.4.12 torch-efficient-distloss-0.1.3 torch-scatter-2.1.1 torchmetrics-0.11.4 tqdm-4.65.0 wandb-0.14.2 yarl-1.8.2

Similar problems found in:

https://github.com/Lightning-AI/lightning/issues/17191

That issue points to this migration guide:

https://lightning.ai/docs/pytorch/stable/upgrade/migration_guide.html
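The import that fails in the traceback is `DDPPlugin` from `pytorch_lightning.plugins`. Lightning 1.6 and later ship a `DDPStrategy` class in `pytorch_lightning.strategies`. Whether it is a drop-in replacement for the way run.py uses `DDPPlugin` is an assumption that has not been checked against this repository. A minimal sketch of a fallback lookup, where `resolve_ddp_class` is a hypothetical helper name:

```python
import importlib


def resolve_ddp_class():
    """Return the DDP class under whichever name is importable, else None.

    Hypothetical helper: tries the newer name first, then the older
    name that run.py imports in the traceback above.
    """
    candidates = [
        ("pytorch_lightning.strategies", "DDPStrategy"),  # newer location
        ("pytorch_lightning.plugins", "DDPPlugin"),       # older location
    ]
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Module (or the whole package) is not available; try the next one.
            continue
        cls = getattr(module, attr_name, None)
        if cls is not None:
            return cls
    return None


if __name__ == "__main__":
    ddp_class = resolve_ddp_class()
    if ddp_class is None:
        print("No DDP class found; is pytorch_lightning installed?")
    else:
        print(f"Using {ddp_class.__module__}.{ddp_class.__name__}")
```

Any constructor arguments that run.py passes to `DDPPlugin` would still need to be checked against the class that is actually returned.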

Possible temporary solution

Pin pytorch_lightning==1.9.5 in requirements.txt.
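Because an unpinned requirement pulled in 2.0.1.post0 here, a start-up check can surface the mismatch with a clearer message than the ImportError. The sketch below uses only the standard library. The `max_major=1` threshold and the function names are assumptions drawn from this thread, not something the project defines:

```python
from importlib import metadata


def major_version(version_string):
    """Return the leading integer of a version such as '2.0.1.post0'."""
    return int(version_string.split(".")[0])


def check_lightning(max_major=1, package="pytorch-lightning"):
    """Return a warning string if the installed version looks too new, else None.

    Assumption from this thread: 2.x fails on the DDPPlugin import.
    """
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed"
    if major_version(installed) > max_major:
        return (
            f"{package} {installed} is installed; this thread reports an "
            f"ImportError for DDPPlugin with 2.x. Consider pinning an older release."
        )
    return None


if __name__ == "__main__":
    message = check_lightning()
    print(message or "pytorch-lightning version looks compatible")
```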

dogyoonlee commented 1 year ago

I have the same issue!

After I changed to pytorch_lightning==1.9.5, a 'BatchSampler' error occurs instead. The last part of the error message is as follows.

In call to configurable 'run' (<function run at 0x7f7fb3ef59d0>)
    dataloader = _update_dataloader(dataloader, sampler, mode=mode)
  File "/compuworks/anaconda3/envs/nerf_factory/lib/python3.9/site-packages/pytorch_lightning/utilities/data.py", line 157, in _update_dataloader
    dl_args, dl_kwargs = _get_dataloader_init_args_and_kwargs(dataloader, sampler, mode)
  File "/compuworks/anaconda3/envs/nerf_factory/lib/python3.9/site-packages/pytorch_lightning/utilities/data.py", line 218, in _get_dataloader_init_args_and_kwargs
    dl_kwargs.update(_dataloader_init_kwargs_resolve_sampler(dataloader, sampler, mode, disallow_batch_sampler))
  File "/compuworks/anaconda3/envs/nerf_factory/lib/python3.9/site-packages/pytorch_lightning/utilities/data.py", line 342, in _dataloader_init_kwargs_resolve_sampler
    raise MisconfigurationException(
lightning_fabric.utilities.exceptions.MisconfigurationException: We tried to re-instantiate your custom batch sampler and failed. To mitigate this, either follow the API of `BatchSampler` or instantiate your custom batch sampler inside `*_dataloader` hooks of your module.
  In call to configurable 'run' (<function run at 0x7f66310e29d0>)

Did you solve it?
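The error message above asks for a batch sampler that follows the `BatchSampler` API. For `torch.utils.data.BatchSampler`, that means a class that can be constructed as `cls(sampler, batch_size, drop_last)`, so that Lightning can re-instantiate it. What the sampler in this repository actually looks like is not shown in the thread, so the following is only an illustrative sketch with a hypothetical class name. It is kept free of dependencies here, whereas a real implementation would typically subclass `torch.utils.data.Sampler`:

```python
class ChunkedBatchSampler:
    """Hypothetical batch sampler with the (sampler, batch_size, drop_last) signature."""

    def __init__(self, sampler, batch_size, drop_last):
        # Same three constructor arguments, in the same order, as
        # torch.utils.data.BatchSampler.
        self.sampler = sampler
        self.batch_size = batch_size
        self.drop_last = drop_last

    def __iter__(self):
        batch = []
        for index in self.sampler:
            batch.append(index)
            if len(batch) == self.batch_size:
                yield batch
                batch = []
        # Emit the final short batch unless the caller asked to drop it.
        if batch and not self.drop_last:
            yield batch

    def __len__(self):
        total = len(self.sampler)
        if self.drop_last:
            return total // self.batch_size
        return (total + self.batch_size - 1) // self.batch_size


if __name__ == "__main__":
    for batch in ChunkedBatchSampler(range(7), batch_size=3, drop_last=False):
        print(batch)
```

The other option named in the error message is to build the custom batch sampler inside the module's `*_dataloader` hooks.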

19991105 commented 1 year ago

I ran into the same problem. Installing pytorch_lightning==1.6.0 resolved it for me.