broadinstitute / keras-rcnn

Keras package for region-based convolutional neural networks (RCNNs)
Other
553 stars 223 forks source link

Bug fix to bounding box generator #60

Closed yhenon closed 6 years ago

yhenon commented 6 years ago

Small fix in the object detection generator. ds is a list of dicts, for example: [{u'x2': 151, u'y1': 299, u'x1': 34, u'y2': 422, u'class': u'rbc'}, ...]

enumerate(d) has no guarantee on the order if d is a dict, so [d[k] for i, k in enumerate(d) if i > 0] can return any 4 of the keys in the dict. In my case it looked like this before the fix:

(Pdb) boxes
array([[u'299', u'34', u'422', u'rbc'],
       [u'459', u'718', u'573', u'rbc'],
       [u'235', u'629', u'343', u'rbc'],
       [u'1024', u'888', u'1137', u'rbc'],
       [u'288', u'907', u'392', u'rbc'],
       [u'963', u'637', u'1082', u'rbc'], (truncated)

After:

(Pdb) boxes
array([[ 660,  757,  998, 1091],
       [ 706,  807,  147,  244],
       [ 755,  867,  749,  862],
       [ 486,  607,  232,  363],
       [1073, 1188,  145,  248],
       [ 865,  970,  270,  371], (truncated)

This ['y1','x1','y2','x2'] is the order specified in tensorflow.non_maximum_suppression ('Bounding boxes are supplied as [y1, x1, y2, x2]'), but I'm not 100% sure it's what we want to use here.

0x00b1 commented 6 years ago

Whoa! Nice catch, @yhenon!

I didn't realize tensorflow.image.non_max_supression expects [y1, x1, y2, x2]. Strange. Do you have an opinion on what should be the format used by keras_rcnn.backend.non_max_supression? I would expect [x1, y1, x2, y2].

yhenon commented 6 years ago

@0x00b1 yeah I thought it seemed strange too. I suppose that it matches the order of images in memory, which is (rows, cols, channels) usually (for row-major implementations like numpy), so y followed by x.

I feel like [x1, y1, x2, y2] is much more common and intuitive though.