Closed: small-zeng closed this issue 2 months ago
Hi, I'm facing the same error as well, have you found any fix for this by any chance?
2024-04-07 08:06:42: (Active Mapping process) Process starts!!! (PID=27332)
/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:558: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
0it [00:00, ?it/s]2024-04-07 08:06:59: (Inactive Mapping process) Process starts!!! (PID=27379)
0it [00:20, ?it/s]
Traceback (most recent call last):
File "/content/drive/MyDrive/MIPSFusionTrial/MIPSFusion/main.py", line 20, in <module>
slam.run()
File "/content/drive/MyDrive/MIPSFusionTrial/MIPSFusion/mipsfusion.py", line 674, in run
for i, batch in tqdm( enumerate(data_loader) ):
File "/usr/local/lib/python3.10/dist-packages/tqdm/std.py", line 1181, in __iter__
for obj in iterable:
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 631, in __next__
data = self._next_data()
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
return self._process_data(data)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
data.reraise()
File "/usr/local/lib/python3.10/dist-packages/torch/_utils.py", line 722, in reraise
raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/content/drive/MyDrive/MIPSFusionTrial/MIPSFusion/datasets/dataset.py", line 311, in __getitem__
"c2w": self.poses[index],
IndexError: list index out of range
Exception ignored in atexit callback: <function _exit_function at 0x7a2744559ea0>
Traceback (most recent call last):
File "/usr/lib/python3.10/multiprocessing/util.py", line 357, in _exit_function
p.join()
File "/usr/lib/python3.10/multiprocessing/process.py", line 149, in join
res = self._popen.wait(timeout)
File "/usr/lib/python3.10/multiprocessing/popen_fork.py", line 43, in wait
return self.poll(os.WNOHANG if timeout == 0.0 else 0)
File "/usr/lib/python3.10/multiprocessing/popen_fork.py", line 27, in poll
pid, sts = os.waitpid(self.pid, flag)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 27392) is killed by signal: Terminated.
[W CudaIPCTypes.cpp:16] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
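Side note on the `UserWarning` at the top of the log: the DataLoader asks for 4 workers while the machine (a 2-CPU Colab instance, by the look of it) suggests at most 2. This is unrelated to the `IndexError`, but it can be silenced by clamping the worker count. A minimal sketch, assuming you can edit the place where the DataLoader is constructed (the function name here is hypothetical):

```python
import os

def safe_num_workers(requested: int = 4) -> int:
    """Clamp the requested DataLoader worker count to the CPUs actually
    available, so torch does not warn about excessive worker creation."""
    available = os.cpu_count() or 1
    return max(0, min(requested, available))
```

The result would then be passed as `num_workers=safe_num_workers()` when building the `torch.utils.data.DataLoader`.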
You may have downloaded only the raw data of the FastCaMo-synth dataset and not the GT poses (also linked in README.md), so the dataloader cannot find the GT poses and raises this error (even though the GT poses are only used for evaluation). The correct approach is to download both the raw data and the GT poses and put them together, following the instructions in the latter link. The original author of this dataset split these two parts into two separate compressed files.
It was my mistake not to explain this explicitly. To prevent this inconvenience from happening again, I have re-uploaded the dataset with the raw data and GT poses packaged together (only one file needs to be downloaded), and updated the README file accordingly.
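For anyone who wants to verify their download before running, the failure mode above (`IndexError` at `self.poses[index]`) means the sequence has more frames than GT poses. A quick sanity check, sketched below; the subdirectory names `rgb` and `pose` are assumptions, so adjust them to the actual FastCaMo-synth layout:

```python
from pathlib import Path

def check_sequence(seq_dir: str) -> None:
    """Verify there is at least one GT pose per RGB frame in a sequence
    directory; raise before training instead of deep inside a DataLoader
    worker. Subdirectory names are hypothetical -- match your layout."""
    seq = Path(seq_dir)
    n_frames = len(list((seq / "rgb").glob("*")))
    n_poses = len(list((seq / "pose").glob("*")))
    if n_poses < n_frames:
        raise RuntimeError(
            f"{n_frames} frames but only {n_poses} GT poses in {seq_dir}; "
            "did you download the GT-poses archive as well?"
        )
    print(f"OK: {n_frames} frames, {n_poses} poses")
```

Running this once per sequence directory turns the opaque worker-side `IndexError` into an immediate, readable message.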
Thanks a lot for the response! Will try it out.
Update: It works with the new dataset!
Thanks for your nice work. When I run it, this error occurs; how can I solve it?
python main.py --config configs/FastCaMo-synth/room_0.yaml
2023-12-18 17:22:22: (Active Mapping process) Process starts!!! (PID=36528)
2023-12-18 17:22:22: (Inactive Mapping process) Process starts!!! (PID=36749)
0it [00:00, ?it/s]0 torch.Size([460, 620, 3]) torch.Size([460, 620])
0it [00:03, ?it/s]
Traceback (most recent call last):
File "main.py", line 20, in <module>
slam.run()
File "/remote-home/zengjing/Projects/MIPSFusion/mipsfusion.py", line 674, in run
for i, batch in tqdm( enumerate(data_loader) ):
File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/tqdm/std.py", line 1182, in __iter__
for obj in iterable:
File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 681, in __next__
data = self._next_data()
File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
return self._process_data(data)
File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
data.reraise()
File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/_utils.py", line 461, in reraise
raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/remote-home/zengjing/Projects/MIPSFusion/datasets/dataset.py", line 313, in __getitem__
"c2w": self.poses[index],
IndexError: list index out of range
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/root/anaconda3/envs/sdfstudio/lib/python3.8/multiprocessing/popen_fork.py", line 27, in poll
pid, sts = os.waitpid(self.pid, flag)
File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 36936) is killed by signal: Terminated.
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]