yjtang249 / MIPSFusion

[SIGGRAPH Asia 2023] MIPSFusion is a neural SLAM method based on multi-implicit-submap representation for scalable online RGB-D reconstruction.
GNU General Public License v3.0
34 stars 7 forks

IndexError: list index out of range #1

Closed small-zeng closed 2 months ago

small-zeng commented 6 months ago

Thanks for your nice work. When I run it, this error occurs. How can I solve it?

python main.py --config configs/FastCaMo-synth/room_0.yaml

2023-12-18 17:22:22: (Active Mapping process) Process starts!!! (PID=36528)
2023-12-18 17:22:22: (Inactive Mapping process) Process starts!!! (PID=36749)
0it [00:00, ?it/s]0 torch.Size([460, 620, 3]) torch.Size([460, 620])
0it [00:03, ?it/s]
Traceback (most recent call last):
  File "main.py", line 20, in <module>
    slam.run()
  File "/remote-home/zengjing/Projects/MIPSFusion/mipsfusion.py", line 674, in run
    for i, batch in tqdm( enumerate(data_loader) ):
  File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/tqdm/std.py", line 1182, in __iter__
    for obj in iterable:
  File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 681, in __next__
    data = self._next_data()
  File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
    return self._process_data(data)
  File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
    data.reraise()
  File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/_utils.py", line 461, in reraise
    raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/remote-home/zengjing/Projects/MIPSFusion/datasets/dataset.py", line 313, in __getitem__
    "c2w": self.poses[index],
IndexError: list index out of range

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/root/anaconda3/envs/sdfstudio/lib/python3.8/multiprocessing/popen_fork.py", line 27, in poll
    pid, sts = os.waitpid(self.pid, flag)
  File "/root/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 36936) is killed by signal: Terminated.
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]

vijay-jaisankar commented 3 months ago

Hi, I'm facing the same error as well, have you found any fix for this by any chance?

Stack Trace

2024-04-07 08:06:42: (Active Mapping process) Process starts!!! (PID=27332)
/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:558: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(
0it [00:00, ?it/s]2024-04-07 08:06:59: (Inactive Mapping process) Process starts!!! (PID=27379)
0it [00:20, ?it/s]
Traceback (most recent call last):
  File "/content/drive/MyDrive/MIPSFusionTrial/MIPSFusion/main.py", line 20, in <module>
    slam.run()
  File "/content/drive/MyDrive/MIPSFusionTrial/MIPSFusion/mipsfusion.py", line 674, in run
    for i, batch in tqdm( enumerate(data_loader) ):
  File "/usr/local/lib/python3.10/dist-packages/tqdm/std.py", line 1181, in __iter__
    for obj in iterable:
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.10/dist-packages/torch/_utils.py", line 722, in reraise
    raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/drive/MyDrive/MIPSFusionTrial/MIPSFusion/datasets/dataset.py", line 311, in __getitem__
    "c2w": self.poses[index],
IndexError: list index out of range

Exception ignored in atexit callback: <function _exit_function at 0x7a2744559ea0>
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/util.py", line 357, in _exit_function
    p.join()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 149, in join
    res = self._popen.wait(timeout)
  File "/usr/lib/python3.10/multiprocessing/popen_fork.py", line 43, in wait
    return self.poll(os.WNOHANG if timeout == 0.0 else 0)
  File "/usr/lib/python3.10/multiprocessing/popen_fork.py", line 27, in poll
    pid, sts = os.waitpid(self.pid, flag)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 27392) is killed by signal: Terminated. 
[W CudaIPCTypes.cpp:16] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
yjtang249 commented 2 months ago

Maybe you only downloaded the raw data of the FastCaMo-synth dataset and didn't download the GT poses, which I also provided in README.md, so the dataloader cannot find the GT poses and reports this error (even though the GT poses are only used for evaluation). The correct approach is to download both the raw data and the GT poses and put them together according to the instructions in the latter link, because the original author of this dataset split these two parts into two separate compressed files.
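
If you want to confirm this before re-downloading, a minimal sanity check along the following lines can reveal the mismatch. The scene path and file patterns below are assumptions about how the FastCaMo-synth data was placed locally, so adjust them to your own layout.

```python
# Hypothetical sanity check: if there are fewer GT pose files than RGB frames,
# the dataset's __getitem__ will index past the end of self.poses and raise IndexError.
import glob

scene_dir = "data/FastCaMo-synth/room_0"                     # assumed scene location
rgb_files = sorted(glob.glob(f"{scene_dir}/rgb/*.png"))      # assumed RGB frame pattern
pose_files = sorted(glob.glob(f"{scene_dir}/poses/*.txt"))   # assumed GT pose pattern

print(f"RGB frames: {len(rgb_files)}, GT poses: {len(pose_files)}")
if len(pose_files) < len(rgb_files):
    print("GT poses are missing or incomplete -- download them as described in README.md.")
```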

It was my mistake not to explain this explicitly. To prevent such inconvenience from happening again, I re-uploaded the dataset with the raw data and GT poses packed together (you only need to download one file), and updated the README file at the same time.

vijay-jaisankar commented 2 months ago

Thanks a lot for the response! Will try it out.

Update: It works with the new dataset!