NVlabs / Deep_Object_Pose

Deep Object Pose Estimation (DOPE) – ROS inference (CoRL 2018)

Training on custom dataset gives tensor size exception #139

Closed: sebastian-ruiz closed this issue 4 years ago

sebastian-ruiz commented 4 years ago

I have created a custom dataset of an object using NDDS. When I try to train on it, I run into the following error:

./scripts/train.py --data /home/sruiz/datasets/my_object_training_data --datatest /home/sruiz/datasets/my_object_test_data --object my_object --imagesize 512 --gpuids 2 3 4
start: 12:07:38.158108
load data
training data: 313 batches
testing data: 32 batches
load models
Training network pretrained on imagenet.
Traceback (most recent call last):
  File "./scripts/train.py", line 1392, in <module>
    _runnetwork(epoch,trainingdata)
  File "./scripts/train.py", line 1330, in _runnetwork
    for batch_idx, targets in enumerate(loader):
  File "/home/sruiz/miniconda3/envs/dope/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
    data = self._next_data()
  File "/home/sruiz/miniconda3/envs/dope/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 989, in _next_data
    return self._process_data(data)
  File "/home/sruiz/miniconda3/envs/dope/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1014, in _process_data
    data.reraise()
  File "/home/sruiz/miniconda3/envs/dope/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/sruiz/miniconda3/envs/dope/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/sruiz/miniconda3/envs/dope/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/sruiz/miniconda3/envs/dope/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "./scripts/train.py", line 678, in __getitem__
    affinities = torch.cat([affinities,torch.zeros(16,1,50)],dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Got 45 and 50 in dimension 2 (The offending index is 1)

Does anyone know why I am getting this error? Looking through the other issues, nobody seems to have hit the same one. Thank you very much!
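
For reference, the failing call is a plain torch.cat whose zero pad has a hard-coded width of 50, and torch.cat requires every dimension except the concatenation dimension to match. A minimal standalone snippet (the shapes below are illustrative, not taken from my dataset) reproduces the same message:

import torch

# train.py pads the affinity maps along dim=1 with a zero block whose last
# dimension is hard-coded to 50. torch.cat requires all non-concatenation
# dimensions to match, so an affinity map that is 45 wide triggers the error.
affinities = torch.zeros(16, 8, 45)   # illustrative shape; width != 50
padding = torch.zeros(16, 1, 50)      # the hard-coded pad from train.py line 678

try:
    torch.cat([affinities, padding], dim=1)
except RuntimeError as e:
    print(e)  # Sizes of tensors must match except in dimension 1. Got 45 and 50 ...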

TontonTremblay commented 4 years ago

Can you try image size 400? I think 512 is causing issues.
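
As far as I can tell, the zero pad at train.py line 678 assumes the belief/affinity maps are 50 elements wide, which matches the default 400x400 input (the feature extractor downsamples by a factor of 8, and 400 / 8 = 50). With --imagesize 512 the dataset generates maps of a different width (45 in your traceback), so the hard-coded torch.zeros(16, 1, 50) no longer lines up. If you really need a non-default image size, one possible (untested) adjustment is to derive the pad width from the tensor itself rather than hard-coding 50, though other parts of the script may still assume the 400-pixel default:

import torch

def pad_affinities(affinities: torch.Tensor) -> torch.Tensor:
    # Hypothetical replacement for the hard-coded pad at train.py line 678:
    # take the pad width from the map that was actually generated instead of
    # assuming the 50-wide output of the default 400x400 image size.
    pad_width = affinities.size(2)                     # 50 for --imagesize 400
    zeros = torch.zeros(affinities.size(0), 1, pad_width)
    return torch.cat([affinities, zeros], dim=1)

# e.g. a 45-wide map (as in the traceback above) now pads without an exception
print(pad_affinities(torch.zeros(16, 8, 45)).shape)    # torch.Size([16, 9, 45])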

sebastian-ruiz commented 4 years ago

Thanks! That gets rid of the error.