Closed Steffgroe closed 5 years ago
Fixed by the following code:

```python
def _load_pascal_annotation(self, index):
    """
    Load image and bounding boxes info from XML file in the PASCAL VOC
    format.
    """
    filename = os.path.join(self._data_path, 'Annotations', index + '.xml')
    tree = ET.parse(filename)
    objs = tree.findall('object')
    if not self.config['use_diff']:
        non_diff_objs = [
            obj for obj in objs if int(obj.find('difficult').text) == 0]
        # if len(non_diff_objs) != len(objs):
        #     print 'Removed {} difficult objects'.format(
        #         len(objs) - len(non_diff_objs))
        objs = non_diff_objs
    # Keep only the objects whose class name is listed in self._classes
    cls_objs = [obj for obj in objs if obj.find('name').text in self._classes]
    objs = cls_objs
    num_objs = len(objs)

    boxes = np.zeros((num_objs, 4), dtype=np.uint16)
    gt_classes = np.zeros((num_objs), dtype=np.int32)
    overlaps = np.zeros((num_objs, self.num_classes), dtype=np.float32)
    # "Seg" area for pascal is just the box area
    seg_areas = np.zeros((num_objs), dtype=np.float32)

    # Load object bounding boxes into a data frame.
    for ix, obj in enumerate(objs):
        bbox = obj.find('bndbox')
        # Make pixel indexes 0-based
        x1 = float(bbox.find('xmin').text) - 1
        y1 = float(bbox.find('ymin').text) - 1
        x2 = float(bbox.find('xmax').text) - 1
        y2 = float(bbox.find('ymax').text) - 1
        cls = self._class_to_ind[obj.find('name').text.lower().strip()]
        boxes[ix, :] = [x1, y1, x2, y2]
        gt_classes[ix] = cls
        overlaps[ix, cls] = 1.0
        seg_areas[ix] = (x2 - x1 + 1) * (y2 - y1 + 1)

    overlaps = scipy.sparse.csr_matrix(overlaps)

    return {'boxes': boxes,
            'gt_classes': gt_classes,
            'gt_overlaps': overlaps,
            'flipped': False,
            'seg_areas': seg_areas}
```
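The key change in the code above is the extra filter that drops every object whose class name is not in `self._classes`. A minimal, standalone sketch of that filtering step is shown here, assuming a two-class setup of `__background__` and `person`. The names `CLASSES`, `CLASS_TO_IND`, `SAMPLE_XML`, and `filter_objects` are hypothetical and the annotation is made up for illustration. Unlike the posted code, the sketch normalizes the name with `lower().strip()` in the filter as well, so that the filter and the class lookup agree.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-class setup: background at index 0, 'person' at index 1.
CLASSES = ('__background__', 'person')
CLASS_TO_IND = {name: i for i, name in enumerate(CLASSES)}

# Made-up annotation in the PASCAL VOC layout with three objects.
SAMPLE_XML = """
<annotation>
  <object>
    <name>person</name><difficult>0</difficult>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>50</xmax><ymax>80</ymax></bndbox>
  </object>
  <object>
    <name>dog</name><difficult>0</difficult>
    <bndbox><xmin>5</xmin><ymin>5</ymin><xmax>30</xmax><ymax>40</ymax></bndbox>
  </object>
  <object>
    <name>person</name><difficult>1</difficult>
    <bndbox><xmin>60</xmin><ymin>10</ymin><xmax>90</xmax><ymax>70</ymax></bndbox>
  </object>
</annotation>
""".strip()


def filter_objects(root, classes, use_diff=False):
    """Return the <object> elements that survive both filters."""
    objs = root.findall('object')
    if not use_diff:
        # Drop objects flagged as difficult.
        objs = [o for o in objs if int(o.find('difficult').text) == 0]
    # Drop objects whose (normalized) class name is not being trained on.
    return [o for o in objs if o.find('name').text.lower().strip() in classes]


root = ET.fromstring(SAMPLE_XML)
kept = filter_objects(root, CLASSES)
labels = [CLASS_TO_IND[o.find('name').text.lower().strip()] for o in kept]
print(len(kept), labels)
```

With this sample only the non-difficult `person` object is kept; the `dog` and the difficult `person` are discarded before any box or label arrays are allocated.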
Hey @Steffgroe, I want to ask whether you succeeded in detecting only 'person' and 'background' using the existing VOC data. I changed my code to match yours above and finished training. However, in the demo the model detected cats and dogs as 'person' with very high accuracy. Do you know why?
Hi, I have been playing around with this implementation of Faster R-CNN for some time. I succeeded in training on the INRIA person data set myself, but I still can't get this implementation to classify persons and background using the PASCAL VOC data. I made the attached files for this. When I try to execute the code after adding the data set to the script, I get the following error:

    Preparing training data...
    Traceback (most recent call last):
      File "./tools/trainval_net.py", line 105, in <module>
        imdb, roidb = combined_roidb(args.imdb_name)
      File "./tools/trainval_net.py", line 76, in combined_roidb
        roidbs = [get_roidb(s) for s in imdb_names.split('+')]
      File "./tools/trainval_net.py", line 73, in get_roidb
        roidb = get_training_roidb(imdb)
      File "tf-faster-rcnn-master/tools/../lib/model/train_val.py", line 332, in get_training_roidb
        rdl_roidb.prepare_roidb(imdb)
      File "tf-faster-rcnn-master/tools/../lib/roi_data_layer/roidb.py", line 49, in prepare_roidb
        assert all(max_classes[nonzero_inds] != 0)
    AssertionError
    Command exited with non-zero status 1

How can I change the code so that Faster R-CNN detects only the person class using the provided PASCAL VOC data set?

pascal_voc_person.zip
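The failing line in the traceback, `assert all(max_classes[nonzero_inds] != 0)`, requires that every box with a positive overlap is assigned a non-background class. The sketch below is a reconstruction of that kind of check, not the repository's exact code: the helper `check_overlaps` and the arrays `good` and `bad` are hypothetical, and class index 0 is assumed to be the background. It shows one situation that would trip the check, namely a ground-truth object that ends up labeled with index 0. The thread does not confirm that this was the cause in the setup described above.

```python
import numpy as np


def check_overlaps(gt_overlaps):
    """Evaluate two sanity checks on a dense (num_objs, num_classes) array.

    Returns (background_ok, foreground_ok) instead of raising, so both
    conditions can be inspected.
    """
    max_overlaps = gt_overlaps.max(axis=1)    # best overlap per box
    max_classes = gt_overlaps.argmax(axis=1)  # class holding that overlap
    zero_inds = np.where(max_overlaps == 0)[0]
    nonzero_inds = np.where(max_overlaps > 0)[0]
    # Boxes with zero overlap must fall back to class 0 (background).
    background_ok = bool(np.all(max_classes[zero_inds] == 0))
    # Boxes with positive overlap must belong to a foreground class.
    foreground_ok = bool(np.all(max_classes[nonzero_inds] != 0))
    return background_ok, foreground_ok


# Assumed layout: column 0 = background, column 1 = person.
good = np.zeros((2, 2), dtype=np.float32)
good[0, 1] = 1.0
good[1, 1] = 1.0

bad = np.zeros((2, 2), dtype=np.float32)
bad[0, 1] = 1.0
bad[1, 0] = 1.0  # a ground-truth object labeled with index 0

print(check_overlaps(good))
print(check_overlaps(bad))
```

If the second check fails on real data, printing the class names found in the annotations next to the class-to-index mapping is a quick way to see whether any object is being mapped to index 0.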