While running the multi-gpu examples using the readme steps :
Traceback (most recent call last):
File "/home/user/projects/threestudio/launch.py", line 304, in <module>
main(args, extras)
File "/home/user/projects/threestudio/launch.py", line 247, in main
trainer.fit(system, datamodule=dm, ckpt_path=cfg.resume)
File "/home/user/.conda/envs/training/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 608, in fit
call._call_and_handle_interrupt(
File "/home/user/.conda/envs/training/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 36, in _call_and_handle_interrupt
return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
File "/home/user/.conda/envs/training/lib/python3.10/site-packages/pytorch_lightning/strategies/launchers/multiprocessing.py", line 113, in launch
mp.start_processes(
File "/home/user/.conda/envs/training/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
process.start()
File "/home/user/.conda/envs/training/lib/python3.10/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/home/user/.conda/envs/training/lib/python3.10/multiprocessing/context.py", line 288, in _Popen
return Popen(process_obj)
File "/home/user/.conda/envs/training/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/home/user/.conda/envs/training/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/home/user/.conda/envs/training/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/home/user/.conda/envs/training/lib/python3.10/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'get_activation.<locals>.<lambda>'
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
Looking up suggestions for a similar error at https://discuss.huggingface.co/t/cant-pickle-error-using-accelerate-multi-gpu/32358, I found that the error arises from https://github.com/threestudio-project/threestudio/blob/main/launch.py#L169
This is not a machine or driver issue, as I confirmed the following:
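For context, a minimal sketch of the failure mode (hypothetical names, not threestudio's actual code): when PyTorch Lightning launches DDP workers with the "spawn" start method, each process argument is pickled, and a lambda defined inside a function is stored by its qualified name ('get_activation.<locals>.<lambda>'), which cannot be looked up in the child process. A module-level function wrapped with functools.partial pickles cleanly:

```python
import functools
import pickle


def get_activation_broken(scale):
    # Local lambda: pickle records it as
    # 'get_activation_broken.<locals>.<lambda>', which is not importable
    # from a spawned child process -> AttributeError, as in the traceback.
    return lambda x: x * scale


def _scaled(x, scale):
    # Module-level function: picklable by its importable name.
    return x * scale


def get_activation_fixed(scale):
    # functools.partial over a top-level function pickles cleanly.
    return functools.partial(_scaled, scale=scale)


try:
    pickle.dumps(get_activation_broken(2.0))
except AttributeError as e:
    print("broken:", e)  # Can't pickle local object ...

fn = pickle.loads(pickle.dumps(get_activation_fixed(2.0)))
print("fixed:", fn(3.0))
```

The same applies to any closure or lambda captured by the system config before `trainer.fit` is called; replacing it with a named top-level function (or `functools.partial`) is the usual workaround.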