dbolya / yolact

A simple, fully convolutional model for real-time instance segmentation.
MIT License

KeyError: Caught KeyError in DataLoader worker process 0. #693

Open udkii opened 3 years ago

udkii commented 3 years ago

Hi. I'm trying to train on my custom dataset, and I get two errors.

1) KeyError: Caught KeyError in DataLoader worker process 0.
2) KeyError: 7

I don't know why this happens. The full output is attached below.

Scaling parameters by 0.62 to account for a batch size of 5.
Per-GPU batch size is less than the recommended limit for batch norm. Disabling batch norm.
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
/home/aim/anaconda3/envs/yolact-env/lib/python3.7/site-packages/torch/jit/_recursive.py:222: UserWarning: 'lat_layers' was found in ScriptModule constants, but it is a non-constant submodule. Consider removing it.
  " but it is a non-constant {}. Consider removing it.".format(name, hint))
/home/aim/anaconda3/envs/yolact-env/lib/python3.7/site-packages/torch/jit/_recursive.py:222: UserWarning: 'pred_layers' was found in ScriptModule constants, but it is a non-constant submodule. Consider removing it.
  " but it is a non-constant {}. Consider removing it.".format(name, hint))
/home/aim/anaconda3/envs/yolact-env/lib/python3.7/site-packages/torch/jit/_recursive.py:222: UserWarning: 'downsample_layers' was found in ScriptModule constants, but it is a non-constant submodule. Consider removing it.
  " but it is a non-constant {}. Consider removing it.".format(name, hint))
Initializing weights...
/home/aim/anaconda3/envs/yolact-env/lib/python3.7/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /opt/conda/conda-bld/pytorch_1623448265233/work/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
Begin training!

/home/aim/yolact/utils/augmentations.py:309: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  mode = random.choice(self.sample_options)
/home/aim/yolact/utils/augmentations.py:309: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  mode = random.choice(self.sample_options)
/home/aim/yolact/utils/augmentations.py:309: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  mode = random.choice(self.sample_options)
/home/aim/yolact/utils/augmentations.py:309: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  mode = random.choice(self.sample_options)
Traceback (most recent call last):
  File "train.py", line 504, in <module>
    train()
  File "train.py", line 270, in train
    for datum in data_loader:
  File "/home/aim/anaconda3/envs/yolact-env/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/aim/anaconda3/envs/yolact-env/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/home/aim/anaconda3/envs/yolact-env/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/home/aim/anaconda3/envs/yolact-env/lib/python3.7/site-packages/torch/_utils.py", line 425, in reraise
    raise self.exc_type(msg)
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/aim/anaconda3/envs/yolact-env/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/aim/anaconda3/envs/yolact-env/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/aim/anaconda3/envs/yolact-env/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/aim/yolact/data/coco.py", line 94, in __getitem__
    im, gt, masks, h, w, num_crowds = self.pull_item(index)
  File "/home/aim/yolact/data/coco.py", line 153, in pull_item
    target = self.target_transform(target, width, height)
  File "/home/aim/yolact/data/coco.py", line 42, in __call__
    label_idx = self.label_map[label_idx] - 1
KeyError: 7
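The last frame of the traceback is a plain dictionary lookup, `self.label_map[label_idx]`, so `KeyError: 7` suggests that some annotation carries a `category_id` of 7 that is not a key of `label_map`. The sketch below is one possible way to check for that. It assumes the annotation file follows the usual COCO layout (an `annotations` list whose entries have a `category_id`). The helper names `load_coco` and `missing_category_ids` and the sample data are hypothetical and not part of yolact.

```python
import json


def load_coco(path):
    """Load a COCO-style annotation file (hypothetical helper, not part of yolact)."""
    with open(path) as f:
        return json.load(f)


def missing_category_ids(coco, label_map):
    """Return the category ids used by annotations that are not keys of label_map."""
    used = {ann["category_id"] for ann in coco.get("annotations", [])}
    return sorted(used - set(label_map))


# Hypothetical stand-in for the real annotation file; the ids here are assumptions.
sample = {
    "categories": [{"id": 3, "name": "guard"}, {"id": 7, "name": "platform"}],
    "annotations": [{"category_id": 3}, {"category_id": 7}],
}

# The label_map used in the dataset config of this issue.
label_map = {0: 1, 1: 2, 2: 3, 3: 4, 4: 5}

# Any id listed here would raise KeyError at self.label_map[label_idx].
print(missing_category_ids(sample, label_map))
```

For a real dataset, the same check could be run as `missing_category_ids(load_coco('./data/train/trainval.json'), label_map)`. An empty result would mean the cause lies elsewhere.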

I changed two parts of config.py:

----------------------- DATASETS -----------------------

...

scaffold_dataset = dataset_base.copy({
    'name': 'Scaffold',

    'train_images': './data/train/images/',
    'valid_images': './data/train/images/',

    'train_info': './data/train/trainval.json',
    'valid_info': './data/train/trainval.json',

    'class_names': ('guard', 'platform', 'vertical', 'stairs', 'basejack'),
    'label_map' : {0 : 1, 1:2, 2:3, 3:4, 4:5}
})
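The keys of `label_map` are looked up with each annotation's `category_id`, and the code then subtracts 1 from the mapped value (see `self.label_map[label_idx] - 1` in the traceback). One way to keep the map consistent with the annotation file is to derive it from the file's `categories` list instead of writing it by hand. The following is only a sketch under assumptions: the category ids shown are hypothetical (the real ids in trainval.json are not given in this issue), and `build_label_map` and `class_names_in_order` are illustrative helpers, not yolact functions.

```python
def build_label_map(categories):
    """Map each category id to a 1-based contiguous index, ordered by id."""
    ids = sorted(cat["id"] for cat in categories)
    return {cat_id: index for index, cat_id in enumerate(ids, start=1)}


def class_names_in_order(categories):
    """Return class names ordered by category id, matching build_label_map."""
    ordered = sorted(categories, key=lambda cat: cat["id"])
    return tuple(cat["name"] for cat in ordered)


# Hypothetical categories: the names come from this issue, the ids are assumptions.
categories = [
    {"id": 1, "name": "guard"},
    {"id": 3, "name": "platform"},
    {"id": 5, "name": "vertical"},
    {"id": 7, "name": "stairs"},
    {"id": 9, "name": "basejack"},
]

label_map = build_label_map(categories)
class_names = class_names_in_order(categories)

# Same arithmetic as the failing line: mapped value minus 1 indexes class_names.
print(label_map)
print(class_names[label_map[7] - 1])
```

With a map built this way, every `category_id` present in the `categories` list has a key, and the mapped values stay within `1..len(class_names)`.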

----------------------- YOLACT v1.0 CONFIGS -----------------------

...

yolact_resnet50_scaffold_config = yolact_resnet50_config.copy({
    'name': 'yolact_plus_resnet50_scaffold', # Will default to yolact_resnet50_pascal

# Dataset stuff
'dataset': scaffold_dataset,
'num_classes': len(scaffold_dataset.class_names) + 1,

'max_size' : 512,
'max_iter': 120000,
'lr_steps': (60000, 100000),

'backbone': yolact_resnet50_config.backbone.copy({
    'pred_scales': [[32], [64], [128], [256], [512]],
    'use_square_anchors': False,
})

})

I also changed 'shuffle = True' to 'shuffle = False' in train.py because of a CUDA error, following suggestions in other issues.

Kentzhuyi commented 2 years ago

Have you solved this problem? I have the same issue.

Malajiechi commented 1 year ago

Have you solved this problem? I have the same issue.