jbohnslav / deepethogram

IndexError: index 241900 is out of bounds for axis 0 with size 241226 #126

Closed by hummuscience 1 year ago

hummuscience commented 1 year ago

This seems to be a recurring error, and I am not sure what is causing it. It has happened several times before, but restarting from scratch somehow resolved it.

It appears when I try to train the feature extractor; the flow generator trains fine. It seems to be related to the size of the video used for training (in this case, a large video).

Lowering the batch size all the way to 1 lets a few frames load before the same error appears. The machine I am running this on has 500+ GB of RAM and an RTX A6000 GPU (48 GB of VRAM), so memory should not be the limiting factor.

I am also linking other issues with a similar error:

https://github.com/jbohnslav/deepethogram/issues/102

https://github.com/jbohnslav/deepethogram/issues/24

IndexError: Caught IndexError in DataLoader worker process 7.
Original Traceback (most recent call last):
  File "/gs/home/abdelhaym/.conda/envs_ppc/deg/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/gs/home/abdelhaym/.conda/envs_ppc/deg/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/gs/home/abdelhaym/.conda/envs_ppc/deg/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/gs/home/abdelhaym/.conda/envs_ppc/deg/lib/python3.7/site-packages/deepethogram/data/datasets.py", line 415, in __getitem__
    return self.dataset[index]
  File "/gs/home/abdelhaym/.conda/envs_ppc/deg/lib/python3.7/site-packages/torch/utils/data/dataset.py", line 235, in __getitem__
    return self.datasets[dataset_idx][sample_idx]
  File "/gs/home/abdelhaym/.conda/envs_ppc/deg/lib/python3.7/site-packages/deepethogram/data/datasets.py", line 368, in __getitem__
    label = self.labels[index]
IndexError: index 241900 is out of bounds for axis 0 with size 241226

hummuscience commented 1 year ago

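This issue shows up when the number of rows in the label file does not match the number of frames in the video. That is consistent with the traceback above, where the requested index (241900) is larger than the number of label rows (241226). Opening the labels file in the GUI works fine and gives no errors, even though the lengths do not align.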
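For anyone who wants to check their own data before training, here is a minimal sketch of the comparison described above. It is not part of deepethogram: the function names `count_label_rows` and `check_alignment` are hypothetical, and it assumes the label file is a CSV with one header row and one row per video frame.

```python
import csv


def count_label_rows(label_csv, has_header=True):
    """Count data rows in a label CSV.

    Assumption: one row per video frame, plus an optional header row.
    Completely empty lines are ignored.
    """
    with open(label_csv, newline="") as handle:
        n_rows = sum(1 for row in csv.reader(handle) if row)
    return max(n_rows - 1, 0) if has_header else n_rows


def check_alignment(n_frames, n_labels):
    """Return a description of the mismatch, or None if the lengths agree."""
    if n_frames == n_labels:
        return None
    return (
        f"video has {n_frames} frames but label file has {n_labels} rows "
        f"(difference: {n_frames - n_labels})"
    )
```

The frame count has to come from whatever reads the video. One option is OpenCV's `cv2.VideoCapture(path).get(cv2.CAP_PROP_FRAME_COUNT)`, although that value can be approximate for some containers. If `check_alignment` returns a message, any frame index at or beyond the number of label rows would be expected to raise the same `IndexError` during training.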

kylethieringer commented 1 year ago

@Cumol did you find a fix for this? I'm working with large videos on a similarly powerful machine. Other issues mention reducing the batch size, but I'm wondering if you found a different fix.

joshghimire commented 2 months ago

@kylethieringer Did you end up figuring this out, by chance? I'm running into the same issue.