Open wzh506 opened 3 months ago
Hi. Quoting from https://github.com/YosefLab/Compass/issues/46:
This typically happens when some other sort of exception is being thrown from within the subprocess. The exception handler for this original exception then tries to pickle some objects to report back, and from here the second error happens. I believe in the past I've been able to diagnose by running things in a single-threaded manner to see what the underlying cause is.
So likely the error can be found by trying to run the code in a single process instead.
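A minimal sketch of that single-process debugging idea, assuming the worker is an ordinary function: a switch that calls the target directly in the current process, so the original exception and its full traceback appear instead of a secondary pickling error. The names `worker`, `launch`, and `single_process` are hypothetical and are not part of Point-SLAM or Compass.

```python
import multiprocessing as mp


def worker(rank):
    # Hypothetical stand-in for the real per-process work.
    if rank < 0:
        raise ValueError(f"invalid rank: {rank}")
    return rank * 2


def launch(target, args, single_process=False):
    """Run target(*args) either in a child process or inline.

    With single_process=True nothing is pickled and any exception
    propagates directly to the caller with its original traceback.
    """
    if single_process:
        return target(*args)
    p = mp.Process(target=target, args=args)
    p.start()
    p.join()
    return p.exitcode


if __name__ == "__main__":
    # Inline run: the worker's return value comes straight back.
    print(launch(worker, (3,), single_process=True))
```

If the inline run succeeds but the multi-process run still fails, the problem is more likely in what gets pickled at process start than in the worker body itself.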
Thank you for your great work! I have a small question: when I run run.py, the p.start() call in slam.run() fails with:
rank, self.time_string, t_pipe 0 20240828_164627 <multiprocessing.connection.Connection object at 0x7fd425d5f970>
Traceback (most recent call last):
  File "/data/wangzhaohui/github/Point-SLAM/run.py", line 46, in <module>
    main()
  File "/data/wangzhaohui/github/Point-SLAM/run.py", line 42, in main
    slam.run()
  File "/data/wangzhaohui/github/Point-SLAM/src/Point_SLAM.py", line 209, in run
    p.start()
  File "/data/wangzhaohui/anaconda3/envs/point-slam/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/data/wangzhaohui/anaconda3/envs/point-slam/lib/python3.10/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/data/wangzhaohui/anaconda3/envs/point-slam/lib/python3.10/multiprocessing/context.py", line 288, in _Popen
    return Popen(process_obj)
  File "/data/wangzhaohui/anaconda3/envs/point-slam/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/data/wangzhaohui/anaconda3/envs/point-slam/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/data/wangzhaohui/anaconda3/envs/point-slam/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/data/wangzhaohui/anaconda3/envs/point-slam/lib/python3.10/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle 'SwigPyObject' object
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
Tested on Linux with a 4090 GPU.
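The traceback shows the failure happening in the parent, while the spawn start method pickles the process object in `reduction.dump`. One way to narrow this down might be to try pickling each attribute of the object that owns the process target, using the same `ForkingPickler` that appears in the traceback. The sketch below is only an illustration: `find_unpicklable` and `Demo` are hypothetical names, and a `threading.Lock` stands in for the SWIG-wrapped object, since which attribute holds the `SwigPyObject` is not known from the report.

```python
import threading
from multiprocessing.reduction import ForkingPickler


def find_unpicklable(obj):
    """Return {attribute_name: error} for instance attributes that fail to pickle."""
    failures = {}
    for name, value in vars(obj).items():
        try:
            # Same pickler class that multiprocessing uses when spawning.
            ForkingPickler.dumps(value)
        except Exception as exc:
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


class Demo:
    # Hypothetical object; the lock plays the role of the unpicklable member.
    def __init__(self):
        self.rank = 0
        self.name = "demo"
        self.lock = threading.Lock()


if __name__ == "__main__":
    for attr, err in find_unpicklable(Demo()).items():
        print(attr, "->", err)
```

Calling something like `find_unpicklable(slam)` just before `p.start()` could point at the attribute to drop, or to recreate inside the child process, assuming the offending object is a direct attribute rather than nested deeper.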