Open rsazid99 opened 2 years ago
Hi @rsazid99, I've hit the same bug while running semantic segmentation on Toronto3D with RandLA-Net. Have you managed to solve it? Thanks a lot.
@whuhxb I haven't been able to solve this bug yet.
This can be caused by multiprocessing problems. Try adding num_workers: 0 and pin_memory: false to the pipeline section of the configs/randlanet_s3dis.yml file. That solved it for me.
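For reference, a sketch of where those keys would go. This assumes the usual layout of the Open3D-ML config files (a top-level pipeline section); the surrounding keys shown here are illustrative, only num_workers and pin_memory are the actual change:

```yaml
# configs/randlanet_s3dis.yml (sketch of the pipeline section)
pipeline:
  name: SemanticSegmentation
  num_workers: 0      # no DataLoader worker processes, so nothing gets pickled
  pin_memory: false
```

With num_workers: 0 the DataLoader loads batches in the main process, so the unpicklable sampler closure is never sent to a worker.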
Describe the issue
When I run "python scripts/run_pipeline.py torch -c ml3d/configs/randlanet_semantickitti.yml --dataset.dataset_path ../dataset/SemanticKitti --pipeline SemanticSegmentation --dataset.use_cache True --num_workers 0", it fails with:
"ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'SemSegRandomSampler.get_point_sampler.<locals>._random_centered_gen'
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]"
Steps to reproduce the bug
Error message
regular arguments
backend: gloo
batch_size: null
cfg_dataset: null
cfg_file: ml3d/configs/randlanet_semantickitti.yml
cfg_model: null
cfg_pipeline: null
ckpt_path: null
dataset: null
dataset_path: null
device: cuda
device_ids:
extra arguments
dataset.dataset_path: /home/sazid/Open3D-ML/scripts/dataset/SemanticKitti
dataset.use_cache: 'True'
num_workers: '0'
INFO - 2022-10-20 13:25:56,253 - semantic_segmentation - DEVICE : cuda
INFO - 2022-10-20 13:25:56,253 - semantic_segmentation - Logging in file : ./logs/RandLANet_SemanticKITTI_torch/log_train_2022-10-20_13:25:56.txt
INFO - 2022-10-20 13:25:56,286 - semantickitti - Found 19130 pointclouds for train
INFO - 2022-10-20 13:25:57,425 - semantickitti - Found 4071 pointclouds for validation
INFO - 2022-10-20 13:25:57,677 - semantic_segmentation - Initializing from scratch.
INFO - 2022-10-20 13:25:57,678 - semantic_segmentation - Writing summary in train_log/00008_RandLANet_SemanticKITTI_torch.
INFO - 2022-10-20 13:25:57,678 - semantic_segmentation - Started training
INFO - 2022-10-20 13:25:57,679 - semantic_segmentation - === EPOCH 0/100 ===
training: 0%| | 0/4783 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/sazid/Open3D-ML/scripts/run_pipeline.py", line 245, in <module>
sys.exit(main())
File "/home/sazid/Open3D-ML/scripts/run_pipeline.py", line 179, in main
pipeline.run_train()
File "/home/sazid/miniconda3/envs/test/lib/python3.10/site-packages/open3d/_ml3d/torch/pipelines/semantic_segmentation.py", line 406, in run_train
for step, inputs in enumerate(tqdm(train_loader, desc='training')):
File "/home/sazid/miniconda3/envs/test/lib/python3.10/site-packages/tqdm/std.py", line 1195, in __iter__
for obj in iterable:
File "/home/sazid/miniconda3/envs/test/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 444, in __iter__
return self._get_iterator()
File "/home/sazid/miniconda3/envs/test/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 390, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "/home/sazid/miniconda3/envs/test/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1077, in __init__
w.start()
File "/home/sazid/miniconda3/envs/test/lib/python3.10/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/home/sazid/miniconda3/envs/test/lib/python3.10/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/home/sazid/miniconda3/envs/test/lib/python3.10/multiprocessing/context.py", line 300, in _Popen
return Popen(process_obj)
File "/home/sazid/miniconda3/envs/test/lib/python3.10/multiprocessing/popen_forkserver.py", line 35, in __init__
super().__init__(process_obj)
File "/home/sazid/miniconda3/envs/test/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/home/sazid/miniconda3/envs/test/lib/python3.10/multiprocessing/popen_forkserver.py", line 47, in _launch
reduction.dump(process_obj, buf)
File "/home/sazid/miniconda3/envs/test/lib/python3.10/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'SemSegRandomSampler.get_point_sampler.<locals>._random_centered_gen'
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
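A minimal sketch of why this error happens, independent of Open3D-ML: with the forkserver (or spawn) start method, the DataLoader has to pickle its dataset and sampler to send them to worker processes, and Python's pickle cannot serialize a function defined inside another function, because such objects are pickled by qualified name and can't be looked up at module level. The names get_point_sampler and _random_centered_gen below are stand-ins mimicking the real sampler, not the Open3D-ML code itself:

```python
import pickle

def get_point_sampler():
    # Returns a closure, like SemSegRandomSampler.get_point_sampler does.
    def _random_centered_gen():
        return 42
    return _random_centered_gen

def module_level_gen():
    # A module-level function pickles fine: it is stored by name
    # and re-imported in the worker process.
    return 42

sampler = get_point_sampler()
try:
    pickle.dumps(sampler)  # fails: local object has no importable qualified name
    picklable = True
except (AttributeError, pickle.PicklingError):
    picklable = False

restored = pickle.loads(pickle.dumps(module_level_gen))
print(picklable, restored())
```

This is why setting num_workers: 0 sidesteps the error (nothing is pickled), and why moving the generator to module level, or replacing it with a picklable callable class, would be the more structural fix.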
Expected behavior
No response
Open3D, Python and System information
Additional information
No response