dbolya / yolact

A simple, fully convolutional model for real-time instance segmentation.
MIT License

Custom dataset runtime error #44

Closed JulienFleuret closed 5 years ago

JulienFleuret commented 5 years ago


I am trying to retrain YOLACT on Pascal Part, a variation of Pascal VOC where each class has many sub-classes. To simplify everything, I made every sub-class a class in addition to the 20 original ones, which gives me a set of 316 classes. I generated three JSON files for each case.

When I start training I encounter the following error: RuntimeError: cannot perform reduction function max on tensor with no elements because the operation does not have an identity. It happens here: losses = criterion(out, wrapper, wrapper.make_mask()) in train.py, around line 262 (I added some prints to my file, so my line numbers are different).

Here: https://github.com/eriklindernoren/PyTorch-YOLOv3/issues/110

I read it might be a path issue; however, I rechecked and the image paths are correct. Also, I am able to train on Pascal VOC using the same image paths without issues.

I tried to investigate the forward method of the loss function, looking for an empty tensor, but I did not find any.

dbolya commented 5 years ago

Hmm it'd be helpful if you could get the exact line that the error happens on (inside of the loss forward function), but for now can you send over your dataset and model configs? Maybe there's an issue with those.

JulienFleuret commented 5 years ago


In train.py it happens at line 256. That is after the declaration of the variable criterion at line 163; specifically, it happens when the forward method of the class MultiBoxLoss is implicitly called.

In my copy I added several prints in order to investigate, which is why my line numbers differ from the original file.

dbolya commented 5 years ago

I meant the line in multi_box.py, but it looks like the stack trace isn't printing that line out for you? The only call to max that I can see that would get triggered in multi_box.py is when computing the semantic segmentation loss--i.e., would have to do with classes.

Can you send over your dataset and model configs so I can check if you set up the classes properly?

JulienFleuret commented 5 years ago

I'll try to investigate the structure of the dataset more deeply.

JulienFleuret commented 5 years ago

I just realized I misread your last response. The issue appears in the function match from box_utils.py, which is called in the forward method of the class MultiBoxLoss. Here is the error message:

(Note: the prints after "Begin training!" are mine.)

%run train.py --config=yolact_base_config
loading annotations into memory...
Done (t=4.22s)
creating index...
index created!
loading annotations into memory...
Done (t=4.58s)
creating index...
index created!
Initializing weights...
Begin training!

<class 'dict'> dict_keys(['loc', 'conf', 'mask', 'priors', 'proto', 'segm']) <class '__main__.ScatterWrapper'> <class 'torch.Tensor'> torch.Size([8])
BEFORE MATCH:  0.5 0.4 torch.Size([0, 4]) torch.Size([19248, 4]) 0
BOX UTILS: torch.Size([0, 19248])
BEFORE MATCH:  0.5 0.4 torch.Size([0, 4]) torch.Size([19248, 4]) 0
BOX UTILS: torch.Size([0, 19248])
RuntimeError                              Traceback (most recent call last)
/hdd1/prog/yolact/train.py in <module>()
    520                 tmask = wrapper.make_mask()
    521                 print(type(out), out.keys(), type(wrapper), type(wrapper.make_mask()), tmask.shape)
--> 522                 losses = criterion(out, wrapper, wrapper.make_mask())
    524                 losses = { k: v.mean() for k,v in losses.items() } # Mean here because Dataparallel

~/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    487             result = self._slow_forward(*input, **kwargs)
    488         else:
--> 489             result = self.forward(*input, **kwargs)
    490         for hook in self._forward_hooks.values():
    491             hook_result = hook(self, input, result)

~/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py in forward(self, *inputs, **kwargs)
    141             return self.module(*inputs[0], **kwargs[0])
    142         replicas = self.replicate(self.module, self.device_ids[:len(inputs)])
--> 143         outputs = self.parallel_apply(replicas, inputs, kwargs)
    144         return self.gather(outputs, self.output_device)

~/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py in parallel_apply(self, replicas, inputs, kwargs)
    152     def parallel_apply(self, replicas, inputs, kwargs):
--> 153         return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
    155     def gather(self, outputs, output_device):

~/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py in parallel_apply(modules, inputs, kwargs_tup, devices)
     81         output = results[i]
     82         if isinstance(output, Exception):
---> 83             raise output
     84         outputs.append(output)
     85     return outputs

~/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py in _worker(i, module, input, kwargs, device)
     57                 if not isinstance(input, (list, tuple)):
     58                     input = (input,)
---> 59                 output = module(*input, **kwargs)
     60             with lock:
     61                 results[i] = output

~/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    487             result = self._slow_forward(*input, **kwargs)
    488         else:
--> 489             result = self.forward(*input, **kwargs)
    490         for hook in self._forward_hooks.values():
    491             hook_result = hook(self, input, result)

/hdd1/prog/yolact/layers/modules/multibox_loss.py in forward(self, predictions, wrapper, wrapper_mask)
    138             match(self.pos_threshold, self.neg_threshold,
    139                   truths, defaults, labels[idx], crowd_boxes,
--> 140                   loc_t, conf_t, idx_t, idx, loc_data[idx])
    142             gt_box_t[idx, :, :] = truths[idx_t[idx]]

/hdd1/prog/yolact/layers/box_utils.py in match(pos_thresh, neg_thresh, truths, priors, labels, crowd_boxes, loc_t, conf_t, idx_t, idx, loc_data)
    139     # Size [num_objects] best prior for each ground truth
--> 140     best_prior_overlap, best_prior_idx = overlaps.max(1)
    141     # Size [num_priors] best ground truth for each prior
    142     best_truth_overlap, best_truth_idx = overlaps.max(0)

RuntimeError: cannot perform reduction function max on tensor with no elements because the operation does not have an identity

As visible in my prints, I identified that some tensors are empty, but I am still investigating why they are empty. It seems to be linked to the JSON files; however, when I inspect them with pycocotools everything seems normal, in the sense that I did not encounter any errors or issues. I was able to work with the data and access the masks and annotations the way I usually do.

dbolya commented 5 years ago

Can you show me the dataset and model configs you set up in config.py? Looking at that line, it seems one of truths and decoded_priors is empty. If decoded_priors is empty that means something with your model config is wrong and the config produced no anchor boxes. If truths is empty, then that means something with your dataset isn't configured properly (truths should never be empty because I ignore images without GT).

I have a feeling this is a simple mistake in one of your configs (not the COCO annotations, but the config.py configs).

JulienFleuret commented 5 years ago

The dataset looks like this:

PASCAL_PART_CLASSES = ('aeroplane_body', 'aeroplane_engine_1',
                       'aeroplane_engine_2', 'aeroplane_engine_3',
                       'aeroplane_engine_4', 'aeroplane_engine_5',
                       'aeroplane_engine_6', 'aeroplane_lwing',
                       'aeroplane_rwing', 'aeroplane_stern',
                       'aeroplane_tail', 'aeroplane_wheel_1',
                       'aeroplane_wheel_2', 'aeroplane_wheel_3',
                       'aeroplane_wheel_4', 'aeroplane_wheel_5',
                       'aeroplane_wheel_6', 'aeroplane_wheel_7',
                       'aeroplane_wheel_8', 'aeroplane_whole',
                       'bicycle_bwheel', 'bicycle_chainwheel', 'bicycle_fwheel',
                       'bicycle_handlebar', 'bicycle_headlight_1',
                       'bicycle_saddle', 'bicycle_whole',
                       'bird_beak', 'bird_head', 'bird_leye', 'bird_lfoot',
                       'bird_lleg', 'bird_lwing', 'bird_neck', 'bird_reye',
                       'bird_rfoot', 'bird_rleg', 'bird_rwing', 'bird_tail',
                       'bird_torso', 'bird_whole', 'boat_whole', 'bottle_body',
                       'bottle_cap', 'bottle_whole', 'bus_backside',
                       'bus_bliplate', 'bus_door_1', 'bus_door_2', 'bus_door_3',
                       'bus_door_4', 'bus_fliplate', 'bus_frontside',
                       'bus_headlight_1', 'bus_headlight_2', 'bus_headlight_3',
                       'bus_headlight_4', 'bus_headlight_5', 'bus_headlight_6',
                       'bus_headlight_7', 'bus_headlight_8', 'bus_leftmirror',
                       'bus_leftside', 'bus_rightmirror', 'bus_rightside',
                       'bus_roofside', 'bus_wheel_1', 'bus_wheel_2',
                       'bus_wheel_3', 'bus_wheel_4', 'bus_wheel_5', 'bus_whole',
                       'bus_window_1', 'bus_window_10', 'bus_window_11',
                       'bus_window_12', 'bus_window_13', 'bus_window_14',
                       'bus_window_15', 'bus_window_16', 'bus_window_17',
                       'bus_window_18', 'bus_window_19', 'bus_window_2',
                       'bus_window_20', 'bus_window_3', 'bus_window_4',
                       'bus_window_5', 'bus_window_6', 'bus_window_7',
                       'bus_window_8', 'bus_window_9',
                       'car_backside', 'car_bliplate', 'car_door_1',
                       'car_door_2', 'car_door_3', 'car_fliplate',
                       'car_frontside', 'car_headlight_1', 'car_headlight_2',
                       'car_headlight_3', 'car_headlight_4', 'car_headlight_5',
                       'car_headlight_6', 'car_leftmirror', 'car_leftside',
                       'car_rightmirror', 'car_rightside', 'car_roofside',
                       'car_wheel_1', 'car_wheel_2', 'car_wheel_3',
                       'car_wheel_4', 'car_wheel_5', 'car_whole', 'car_window_1',
                       'car_window_2', 'car_window_3', 'car_window_4',
                       'car_window_5', 'car_window_6', 'car_window_7',
                       'cat_head', 'cat_lbleg', 'cat_lbpa', 'cat_lear',
                       'cat_leye', 'cat_lfleg', 'cat_lfpa', 'cat_neck',
                       'cat_nose', 'cat_rbleg', 'cat_rbpa', 'cat_rear',
                       'cat_reye', 'cat_rfleg', 'cat_rfpa', 'cat_tail',
                       'cat_torso', 'cat_whole', 'chair_whole', 'cow_head',
                       'cow_lblleg', 'cow_lbuleg', 'cow_lear', 'cow_leye',
                       'cow_lflleg', 'cow_lfuleg', 'cow_lhorn', 'cow_muzzle',
                       'cow_neck', 'cow_rblleg', 'cow_rbuleg', 'cow_rear',
                       'cow_reye', 'cow_rflleg', 'cow_rfuleg', 'cow_rhorn',
                       'cow_tail', 'cow_torso', 'cow_whole',
                       'dog_head', 'dog_lbleg', 'dog_lbpa', 'dog_lear',
                       'dog_leye', 'dog_lfleg', 'dog_lfpa', 'dog_muzzle',
                       'dog_neck', 'dog_nose', 'dog_rbleg', 'dog_rbpa',
                       'dog_rear', 'dog_reye', 'dog_rfleg', 'dog_rfpa',
                       'dog_tail', 'dog_torso', 'dog_whole',
                       'horse_head', 'horse_lbho', 'horse_lblleg', 'horse_lbuleg',
                       'horse_lear', 'horse_leye', 'horse_lfho', 'horse_lflleg',
                       'horse_lfuleg', 'horse_muzzle', 'horse_neck', 'horse_rbho',
                       'horse_rblleg', 'horse_rbuleg', 'horse_rear', 'horse_reye',
                       'horse_rfho', 'horse_rflleg', 'horse_rfuleg', 'horse_tail',
                       'horse_torso', 'horse_whole',
                       'motorbike_bwheel', 'motorbike_fwheel',
                       'motorbike_handlebar', 'motorbike_headlight_1',
                       'motorbike_headlight_2', 'motorbike_headlight_3',
                       'motorbike_saddle', 'motorbike_whole',
                       'person_hair', 'person_head', 'person_lear',
                       'person_lebrow', 'person_leye', 'person_lfoot',
                       'person_lhand', 'person_llarm', 'person_llleg',
                       'person_luarm', 'person_luleg', 'person_mouth',
                       'person_neck', 'person_nose', 'person_rear',
                       'person_rebrow', 'person_reye', 'person_rfoot',
                       'person_rhand', 'person_rlarm', 'person_rlleg',
                       'person_ruarm', 'person_ruleg', 'person_torso',
                       'pottedplant_plant', 'pottedplant_pot', 'pottedplant_whole',
                       'sheep_head', 'sheep_lblleg', 'sheep_lbuleg', 'sheep_lear',
                       'sheep_leye', 'sheep_lflleg', 'sheep_lfuleg', 'sheep_lhorn',
                       'sheep_muzzle', 'sheep_neck', 'sheep_rblleg', 'sheep_rbuleg',
                       'sheep_rear', 'sheep_reye', 'sheep_rflleg', 'sheep_rfuleg',
                       'sheep_rhorn', 'sheep_tail', 'sheep_torso', 'sheep_whole',
                       'train_cbackside_1', 'train_cbackside_2',
                       'train_cfrontside_1', 'train_cfrontside_2',
                       'train_cfrontside_3', 'train_cfrontside_4',
                       'train_cfrontside_5', 'train_cfrontside_6',
                       'train_cfrontside_7', 'train_cfrontside_9',
                       'train_cleftside_1', 'train_cleftside_2',
                       'train_cleftside_3', 'train_cleftside_4',
                       'train_cleftside_5', 'train_cleftside_6',
                       'train_cleftside_7', 'train_cleftside_8',
                       'train_cleftside_9', 'train_coach_1',
                       'train_coach_2', 'train_coach_3', 'train_coach_4',
                       'train_coach_5', 'train_coach_6', 'train_coach_7',
                       'train_coach_8', 'train_coach_9', 'train_crightside_1',
                       'train_crightside_2', 'train_crightside_3',
                       'train_crightside_4', 'train_crightside_5',
                       'train_crightside_6', 'train_crightside_7',
                       'train_crightside_8', 'train_croofside_1',
                       'train_croofside_2', 'train_croofside_3',
                       'train_croofside_4', 'train_croofside_5',
                       'train_hbackside', 'train_head', 'train_headlight_1',
                       'train_headlight_2', 'train_headlight_3',
                       'train_headlight_4', 'train_headlight_5',
                       'train_hfrontside', 'train_hleftside', 'train_hrightside',
                       'train_hroofside', 'train_whole',
                       'tvmonitor_screen', 'tvmonitor_whole')

PASCAL_PART_LABEL_MAP = {1:1, 2:2, 3:3, 4:4, 5:5, 6:6, 7:7, 8:8, 9:9, 10:10,
 11:11, 12:12, 13:13, 14:14, 15:15, 16:16, 17:17, 18:18, 19:19, 20:20, 21:21,
 22:22, 23:23, 24:24, 25:25, 26:26, 27:27, 28:28, 29:29, 30:30, 31:31, 32:32,
 33:33, 34:34, 35:35, 36:36, 37:37, 38:38, 39:39, 40:40, 41:41, 42:42, 43:43,
 44:44, 45:45, 46:46, 47:47, 48:48, 49:49, 50:50, 51:51, 52:52, 53:53, 54:54,
 55:55, 56:56, 57:57, 58:58, 59:59, 60:60, 61:61, 62:62, 63:63, 64:64, 65:65,
 66:66, 67:67, 68:68, 69:69, 70:70, 71:71, 72:72, 73:73, 74:74, 75:75, 76:76,
 77:77, 78:78, 79:79, 80:80, 81:81, 82:82, 83:83, 84:84, 85:85, 86:86, 87:87,
 88:88, 89:89, 90:90, 91:91, 92:92, 93:93, 94:94, 95:95, 96:96, 97:97, 98:98,
 99:99, 100:100, 101:101, 102:102, 103:103, 104:104, 105:105, 106:106, 107:107,
 108:108, 109:109, 110:110, 111:111, 112:112, 113:113, 114:114, 115:115, 116:116,
 117:117, 118:118, 119:119, 120:120, 121:121, 122:122, 123:123, 124:124, 125:125,
 126:126, 127:127, 128:128, 129:129, 130:130, 131:131, 132:132, 133:133, 134:134,
 135:135, 136:136, 137:137, 138:138, 139:139, 140:140, 141:141, 142:142, 143:143,
 144:144, 145:145, 146:146, 147:147, 148:148, 149:149, 150:150, 151:151, 152:152,
 153:153, 154:154, 155:155, 156:156, 157:157, 158:158, 159:159, 160:160, 161:161,
 162:162, 163:163, 164:164, 165:165, 166:166, 167:167, 168:168, 169:169, 170:170,
 171:171, 172:172, 173:173, 174:174, 175:175, 176:176, 177:177, 178:178, 179:179,
 180:180, 181:181, 182:182, 183:183, 184:184, 185:185, 186:186, 187:187, 188:188,
 189:189, 190:190, 191:191, 192:192, 193:193, 194:194, 195:195, 196:196, 197:197,
 198:198, 199:199, 200:200, 201:201, 202:202, 203:203, 204:204, 205:205, 206:206,
 207:207, 208:208, 209:209, 210:210, 211:211, 212:212, 213:213, 214:214, 215:215,
 216:216, 217:217, 218:218, 219:219, 220:220, 221:221, 222:222, 223:223, 224:224,
 225:225, 226:226, 227:227, 228:228, 229:229, 230:230, 231:231, 232:232, 233:233,
 234:234, 235:235, 236:236, 237:237, 238:238, 239:239, 240:240, 241:241, 242:242,
 243:243, 244:244, 245:245, 246:246, 247:247, 248:248, 249:249, 250:250, 251:251,
 252:252, 253:253, 254:254, 255:255, 256:256, 257:257, 258:258, 259:259, 260:260,
 261:261, 262:262, 263:263, 264:264, 265:265, 266:266, 267:267, 268:268, 269:269,
 270:270, 271:271, 272:272, 273:273, 274:274, 275:275, 276:276, 277:277, 278:278,
 279:279, 280:280, 281:281, 282:282, 283:283, 284:284, 285:285, 286:286, 287:287,
 288:288, 289:289, 290:290, 291:291, 292:292, 293:293, 294:294, 295:295, 296:296,
 297:297, 298:298, 299:299, 300:300, 301:301, 302:302, 303:303, 304:304, 305:305,
 306:306, 307:307, 308:308, 309:309, 310:310, 311:311, 312:312, 313:313, 314:314,
 315:315, 316:316}
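
Since every key in the label map above maps to itself, the same dictionary can be built programmatically rather than typed out. A minimal sketch, assuming the category IDs in the JSON files run contiguously from 1 to 316 (the helper name `identity_label_map` is hypothetical and not part of YOLACT):

```python
def identity_label_map(num_classes):
    """Map category IDs 1..num_classes to themselves."""
    return {i: i for i in range(1, num_classes + 1)}


# Assumption: 316 classes with contiguous IDs starting at 1.
PASCAL_PART_LABEL_MAP = identity_label_map(316)
```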

pascalpart2012_dataset = dataset_base.copy({
    'name': 'PASCAL PART 2012',



    'label_map': PASCAL_PART_LABEL_MAP,
    'class_names': PASCAL_PART_CLASSES,
})

I did not create a model config specifically for this dataset. I verified with Pascal VOC that just setting the dataset and num_classes fields in yolact_base_config to refer to the dataset was enough.

Note that the JSON files were generated thanks to: https://github.com/waspinator/pycococreator I evaluated the result using pycocotools and it works; I was able to access the data, categories, and annotations without issues.

I also came to the conclusion over the weekend that the issue probably comes from my settings. I am investigating to find what is wrong.

Just out of curiosity, what would happen if this error were ignored?

dbolya commented 5 years ago

Ignoring the error will just lead to another one down the line, since something (and by looking at your settings I think truths) is empty when it shouldn't be.

Is there any annotation in the dataset with segmentation but no bounding box or with a bounding box but no segmentation?

Or are there any images with "iscrowd" set to 1 for all annotations for that image?
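
Both questions can be checked directly on the annotation data. The following is a rough sketch, assuming a standard COCO layout with a top-level "annotations" list whose entries carry "id", "image_id", "bbox", "segmentation", and "iscrowd"; the function name `find_suspect_annotations` and the tiny example data are made up for illustration:

```python
from collections import defaultdict


def find_suspect_annotations(coco):
    """Return (bad_annotation_ids, all_crowd_image_ids) for a COCO-style dict."""
    bad_annotation_ids = []
    crowd_flags = defaultdict(list)
    for ann in coco.get("annotations", []):
        has_box = bool(ann.get("bbox"))
        has_mask = bool(ann.get("segmentation"))
        # Flag annotations that have a box without a mask, or the reverse.
        if has_box != has_mask:
            bad_annotation_ids.append(ann["id"])
        crowd_flags[ann["image_id"]].append(ann.get("iscrowd", 0) == 1)
    # Images whose annotations are all crowds have no usable training target.
    all_crowd_image_ids = sorted(
        image_id for image_id, flags in crowd_flags.items() if all(flags)
    )
    return bad_annotation_ids, all_crowd_image_ids


# Tiny in-memory example (hypothetical data).
example = {
    "annotations": [
        {"id": 1, "image_id": 10, "bbox": [0, 0, 5, 5],
         "segmentation": [[0, 0, 5, 0, 5, 5]], "iscrowd": 0},
        {"id": 2, "image_id": 10, "bbox": [],
         "segmentation": [[0, 0, 5, 0, 5, 5]], "iscrowd": 1},
        {"id": 3, "image_id": 11, "bbox": [1, 1, 2, 2],
         "segmentation": [[1, 1, 3, 1, 3, 3]], "iscrowd": 1},
    ]
}
bad_ids, crowd_only_images = find_suspect_annotations(example)
print(bad_ids, crowd_only_images)  # prints "[2] [11]"
```

For a real file, the dict would come from `json.load` on the annotation JSON.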

JulienFleuret commented 5 years ago

I set "iscrowd" to 1 for every annotation. My reason is that I have classes and subclasses in the initial annotations, so the classes and subclasses are connected; some are very close and may slightly overlap. In addition, all of them overlap the whole-object category (e.g. dog_leye overlaps dog_whole). For this reason I set the "iscrowd" flag to true for every annotation.

The original annotations were made from binary masks stored in a set of .mat files, so every annotation is made from a binary mask. But pycococreator, in the process of creating the JSON files, stores the bounding boxes as well as the segmentations. I processed every file the same way and I tried to pay attention to the two errors you mentioned, but I may have made a mistake. I actually regenerated the JSON files yesterday, trying to recheck whether an empty mask could have slipped through. I am starting an investigation of the annotations and bounding boxes right now. If an empty mask has been processed, maybe I can find it first and then figure out what happened.

I am also preparing to do the same thing using a micro dataset (5 categories, 20 images: 6 train, 4 val, 10 test) from COCO: generate a JSON from the COCO one and make it work, then generate 3 JSON files from the masks for each category and check whether it works.

dbolya commented 5 years ago

Well there's your problem. "iscrowd" tells COCO to ignore every detection that overlaps with it. It's not used for training, it's used to give the model a break when there's a crowd of people and each person isn't individually annotated.

Do not use iscrowd for data you actually want to train on. Like every other method and like COCO does in evaluation, I completely ignore all detections that overlap with an "iscrowd" GT.
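
To make the effect concrete: a loader that follows the rule described above keeps only non-crowd annotations as training targets. The sketch below is a simplified illustration in plain Python, not YOLACT's actual code (the function name `split_targets` is hypothetical). When every annotation is flagged as a crowd, the list of training boxes comes out empty, which is consistent with the empty `torch.Size([0, 4])` shape in the prints before the crash.

```python
def split_targets(annotations):
    """Separate training targets from crowd regions (COCO-style dicts)."""
    targets = [a["bbox"] for a in annotations if not a.get("iscrowd", 0)]
    crowd_regions = [a["bbox"] for a in annotations if a.get("iscrowd", 0)]
    return targets, crowd_regions


# Hypothetical image where every annotation was marked as a crowd.
annotations = [
    {"bbox": [10, 20, 30, 40], "iscrowd": 1},
    {"bbox": [15, 25, 5, 5], "iscrowd": 1},
]
targets, crowd_regions = split_targets(annotations)
print(len(targets), len(crowd_regions))  # prints "0 2"
```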

JulienFleuret commented 5 years ago

OK, I did not understand it that way. I am starting the regeneration right now.

JulienFleuret commented 5 years ago

I have great news: it works! The issue came from the iscrowd parameter. Thank you very much for all the time you spent helping me.

davodogster commented 4 years ago

@JSharp4273 Hi Julien, I have a similar problem. All of my images have many overlapping instances, so iscrowd=1 in the coco.json file. How can I train YOLACT with this data? I am able to train Mask R-CNN with it, but I would rather use YOLACT.

Regards, Sam @dbolya

EDIT: The solution was just to find and replace all iscrowd: 1 with iscrowd: 0. I'm using the RLE format (not polygon). I also used waspinator's function to generate the coco.json file.
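
A plain-text find and replace depends on the exact spacing and quoting used in the file. An alternative is to edit the parsed JSON instead. The sketch below assumes a standard COCO layout with a top-level "annotations" list; the function names and file paths are placeholders. It clears the flag on every annotation, so it only suits datasets where no region is a genuine crowd.

```python
import json


def clear_iscrowd(coco):
    """Set iscrowd to 0 on every annotation in place; return how many changed."""
    changed = 0
    for ann in coco.get("annotations", []):
        if ann.get("iscrowd", 0) != 0:
            ann["iscrowd"] = 0
            changed += 1
    return changed


def rewrite_file(src_path, dst_path):
    """Load a COCO JSON file, clear the crowd flags, and write a new file."""
    with open(src_path) as f:
        coco = json.load(f)
    changed = clear_iscrowd(coco)
    with open(dst_path, "w") as f:
        json.dump(coco, f)
    return changed


# Example call with placeholder paths:
# rewrite_file("coco.json", "coco_fixed.json")
```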
