Closed haruka0000 closed 4 years ago
I added a print call to the function where the error occurs.
lib/datasets/layout_coco.py
def boxes2indices(self, boxes):
    print(boxes.shape)  # This line added
    coord_inds = self.loc_map.coords2indices(boxes[:, :2])
    trans_inds = self.trans_map.whs2indices(boxes[:, 2:])
    out_inds = np.concatenate([coord_inds.reshape((-1, 1)), trans_inds], -1)
    return out_inds
Then I checked the output. One of the shapes is different from the rest. Could the layout COCO training data contain invalid samples?
0 14
(4, 4)
(6, 4)
(2, 4)
(9, 4)
(2, 4)
(8, 4)
(9, 4)
(24, 4)
(3, 4)
(4, 4)
(0,) # This
I haven't checked yet which sample produces it.
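For reference, here is a minimal sketch of why a shape of (0,) would break the slicing in boxes2indices, assuming the empty case arrives as a plain empty NumPy array. The helper name as_box_array is hypothetical and not part of the repository.

```python
import numpy as np

def as_box_array(boxes):
    """Hypothetical helper: return boxes as an array of shape (N, 4), even when N == 0."""
    return np.asarray(boxes, dtype=np.float32).reshape(-1, 4)

empty = np.array([])  # shape (0,), like the last shape printed above

try:
    empty[:, :2]      # 2-D indexing on a 1-D array
    failed = False
except IndexError:
    failed = True     # "too many indices for array"

safe = as_box_array(empty)          # shape (0, 4)
xy, wh = safe[:, :2], safe[:, 2:]   # slicing now works and yields empty (0, 2) arrays
print(failed, safe.shape, xy.shape, wh.shape)
```

Whether the downstream mapping functions accept empty (0, 2) inputs is a separate question that this sketch does not answer.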
Hi haruka0000, it seems the model predicts no box in this case. I will double-check the code and get back to you by this weekend.
Thank you for your reply. Okay, I'll wait.
I made a small fix, shown below. With it, training runs fine, although it has not finished yet. The change is in lib/datasets/layout_coco.py, line 134.
# scenedb = [self.load_coco_annotation(index) for index in self.image_index]
# Exclude scenes that have no box annotations.
scenedb = []
for index in self.image_index:
    scene = self.load_coco_annotation(index)  # load each annotation only once
    if scene['boxes'].size > 0:
        scenedb.append(scene)
As a temporary workaround, this lets training run on the scenes that have boxes. However, it cannot use all of the existing data.
I'm sorry to post a lot of comments.
The error is fixed. It happens because some of the images have no box annotations. I remove all of these cases from the training and validation splits, but I want to keep such images in the test set so that we can evaluate every image in it. The bug was in the training script: in my old code, the val2017 split of COCO was named "val" but was actually used as the "test" set.
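The split-dependent filtering described above could look roughly like the following sketch. It assumes each scene is a dict with a 'boxes' array, as in the snippet posted earlier. The function name select_scenes and the split strings are hypothetical and do not come from the repository.

```python
import numpy as np

def select_scenes(scenes, split):
    """Drop box-less scenes for training/validation; keep every scene for testing."""
    if split == 'test':
        return list(scenes)
    return [s for s in scenes if np.asarray(s['boxes']).size > 0]

# Toy data for illustration only: the second scene has no box annotations.
scenes = [
    {'name': 'a', 'boxes': np.zeros((3, 4))},
    {'name': 'b', 'boxes': np.array([])},
    {'name': 'c', 'boxes': np.zeros((1, 4))},
]
train_scenes = select_scenes(scenes, 'train')
test_scenes = select_scenes(scenes, 'test')
print(len(train_scenes), len(test_scenes))
```

With this arrangement, any code that runs on the test split still has to tolerate scenes without boxes.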
Also, for the layout generation model, the default number of training epochs is 50. The model usually converges in fewer epochs, so 50 is an upper bound. You may want to check the validation accuracy after each epoch and pick the checkpoint with the best validation performance. You can also speed up training with multiple GPUs by adding --parallel in the training script.
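Picking the best epoch can be as simple as the sketch below. It assumes the per-epoch validation scores have already been collected into a list. The function name best_epoch is hypothetical, and the numbers are made up for illustration; they are not results from this model.

```python
def best_epoch(val_scores):
    """Return (epoch_index, score) for the highest validation score; ties go to the earliest epoch."""
    if not val_scores:
        raise ValueError("no validation scores recorded")
    epoch = max(range(len(val_scores)), key=val_scores.__getitem__)
    return epoch, val_scores[epoch]

# Made-up scores: validation accuracy peaks at epoch 2 and then declines.
val_scores = [0.41, 0.55, 0.61, 0.60, 0.58]
epoch, score = best_epoch(val_scores)
print(epoch, score)
```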
Thank you for your answer. Now I understand why that error did not happen before. I'll try training with the fixed scripts for 50 epochs.
The error below occurs after epoch 000 finishes training. I followed the README, but I could not get it to work. Could the versions of other libraries be different, for example COCOAPI, numpy, or opencv?
These are the library versions I used.