PRBonn / MaskPLS

Mask-Based Panoptic LiDAR Segmentation for Autonomous Driving, RA-L, 2023
MIT License

TypeError: can't pickle MinkowskiConvolutionFunction objects #5

Closed by comradexy 1 year ago

comradexy commented 1 year ago

When I run "python scripts/evaluate_model.py --w ckpt/mask_pls_kitti.ckpt", I get the error below:

/home/dxy/anaconda3/envs/maskpls/lib/python3.7/site-packages/MinkowskiEngine/__init__.py:42: UserWarning: The environment variable OMP_NUM_THREADS not set. MinkowskiEngine will automatically set OMP_NUM_THREADS=16. If you want to set OMP_NUM_THREADS manually, please export it on the command line before running a python script. e.g. export OMP_NUM_THREADS=12; python your_program.py. It is recommended to set it below 24.
  "It is recommended to set it below 24.",
/home/dxy/anaconda3/envs/maskpls/lib/python3.7/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:850: UserWarning: You requested multiple GPUs but did not specify a backend, e.g. Trainer(strategy="dp"|"ddp"|"ddp2"). Setting strategy="ddp_spawn" for you.
  "You requested multiple GPUs but did not specify a backend, e.g."
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
Traceback (most recent call last):
  File "scripts/evaluate_model.py", line 73, in <module>
    main()
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "scripts/evaluate_model.py", line 52, in main
    trainer.validate(model, data)
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 816, in validate
    return self._call_and_handle_interrupt(self._validate_impl, model, dataloaders, ckpt_path, verbose, datamodule)
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 682, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 859, in _validate_impl
    results = self._run(model, ckpt_path=self.validated_ckpt_path)
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1194, in _run
    self._dispatch()
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1270, in _dispatch
    self.training_type_plugin.start_evaluating(self)
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/ddp_spawn.py", line 178, in start_evaluating
    self.spawn(self.new_process, trainer, self.mp_queue, return_result=False)
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/ddp_spawn.py", line 201, in spawn
    mp.spawn(self._wrapped_function, args=(function, args, kwargs, return_queue), nprocs=self.num_processes)
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 179, in start_processes
    process.start()
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/multiprocessing/process.py", line 112, in start
    self._popen = self._Popen(self)
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/home/dxy/anaconda3/envs/maskpls/lib/python3.7/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle MinkowskiConvolutionFunction objects

I installed all the packages in requirement.txt, and I have no idea what causes this error. Please help if you have any clue.

comradexy commented 1 year ago

Previously, I had set N_GPUS to 2. Testing now runs successfully when I set it to 1. Maybe there is a bug in the multi-GPU processing?

rmarcuzzi commented 1 year ago

Hi! It seems like it's a problem with MinkowskiEngine. I didn't run the code with more than 1 GPU so I didn't see the error.
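
For readers hitting the same error: the traceback ends in multiprocessing's reduction.dump, which means the "spawn" start method tried to pickle the objects handed to the child process and one of them refused. The snippet below is a minimal sketch of how one might locate such an attribute. FakeConvFunction, FakeModel, and unpicklable_attrs are hypothetical names made up for this illustration; they are not part of MaskPLS or MinkowskiEngine, and a threading.Lock merely stands in for any object that pickle rejects.

```python
import pickle
import threading


class FakeConvFunction:
    """Hypothetical stand-in for an object that cannot be pickled."""

    def __init__(self):
        # A lock is a standard-library object that pickle rejects.
        self._lock = threading.Lock()


class FakeModel:
    """Hypothetical model with one picklable and one unpicklable attribute."""

    def __init__(self):
        self.name = "demo"
        self.conv = FakeConvFunction()


def unpicklable_attrs(obj):
    """Return the names of attributes in obj.__dict__ that pickle cannot serialize."""
    bad = []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception:
            # pickle raises TypeError or PicklingError depending on the object.
            bad.append(name)
    return bad


print(unpicklable_attrs(FakeModel()))  # prints ['conv']
```

Running a check like this on the real model before calling trainer.validate would only narrow down which attribute is involved; it does not by itself make the object picklable.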

comradexy commented 1 year ago

Hi! This error was fixed after I specified accelerator="ddp" in the code on line 47:

trainer = Trainer(gpus=cfg.TRAIN.N_GPUS, accelerator="ddp", logger=False)

Btw, this is excellent work. Thank you for your contribution!
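
Since the single-GPU run worked without any backend setting, one way to keep both cases working is to name the backend only when more than one GPU is requested. This is a rough sketch, not the repository's code: trainer_kwargs is a hypothetical helper, and the keyword values simply mirror the Trainer call quoted above. The warning in the log mentions strategy="ddp" as the way to specify a backend, so which keyword applies may depend on the installed Lightning version.

```python
def trainer_kwargs(n_gpus):
    """Build Trainer keyword arguments (hypothetical helper, not part of MaskPLS)."""
    kwargs = {"gpus": n_gpus, "logger": False}
    if n_gpus > 1:
        # Name the backend explicitly so Lightning does not pick "ddp_spawn",
        # which is the path that pickles objects in the traceback.
        kwargs["accelerator"] = "ddp"
    return kwargs


print(trainer_kwargs(1))
print(trainer_kwargs(2))
```

Under these assumptions the call in the evaluation script would become Trainer(**trainer_kwargs(cfg.TRAIN.N_GPUS)).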