Feature: support cropping images using inferred bounding boxes

gwern commented 4 years ago

It would be good if the provided script could crop images down to detected figures/faces instead of just providing JSON & visualizations. For the purpose of adding data augmentation to our Danbooru2019 BigGAN to help it learn solo figures (we already have a cropped portrait dataset, which improved learning of faces noticeably), I added a pass to infer_from_image.py which looks like this:

    # Crop only when exactly one object is detected and it clears the score threshold.
    if len(result["detection_score"]) == 1 and result["detection_score"][0] > FLAGS.min_score_thresh:
      # result = {'detection_score': [0.9958623647689819], 'detection_bbox_ymin': [0.11348748952150345], 'detection_bbox_xmin': [0.6218132972717285], 'detection_bbox_ymax': [0.3206212520599365], 'detection_bbox_xmax': [0.8703262805938721], 'detection_class_label': [1], 'annotated_image': array([[[255, 255, 255], ...
      base, ext = os.path.splitext(os.path.basename(image_path))
      output_crop = os.path.join(FLAGS.output_path, base + '_crop.png')
      height, width = image_np.shape[:2] # np array with shape (height, width, num_color(1, 3, or 4))
      # Box coordinates are fractions of the image size: scale to pixels, then clamp to the image bounds.
      min_x = max(0, min(round(result["detection_bbox_xmin"][0] * width), width))
      max_x = max(0, min(round(result["detection_bbox_xmax"][0] * width), width))
      min_y = max(0, min(round(result["detection_bbox_ymin"][0] * height), height))
      max_y = max(0, min(round(result["detection_bbox_ymax"][0] * height), height))
      image_cropped = image_np[min_y:max_y, min_x:max_x, :]

A cleaned-up and configurable version which crops out each bounding box would be a good addition.
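As a rough sketch of what cropping every box could look like (not the script's actual implementation): the helper below converts each detection above a threshold into a clamped pixel box. The function name `pixel_boxes` and its parameters are hypothetical; it only assumes the `result` dict has the keys shown in the snippet above, with box coordinates given as fractions of the image size.

```python
def pixel_boxes(result, height, width, min_score_thresh=0.5):
    """Return (min_y, max_y, min_x, max_x) pixel boxes for confident detections.

    Hypothetical helper: `result` is assumed to hold parallel lists under the
    'detection_score' and 'detection_bbox_*' keys, with coordinates in [0, 1].
    """
    def to_pixel(fraction, size):
        # Scale a fractional coordinate to pixels and clamp it to [0, size].
        return max(0, min(int(round(fraction * size)), size))

    boxes = []
    for i, score in enumerate(result["detection_score"]):
        if score <= min_score_thresh:
            continue
        min_y = to_pixel(result["detection_bbox_ymin"][i], height)
        max_y = to_pixel(result["detection_bbox_ymax"][i], height)
        min_x = to_pixel(result["detection_bbox_xmin"][i], width)
        max_x = to_pixel(result["detection_bbox_xmax"][i], width)
        # Skip degenerate boxes that would yield an empty crop.
        if max_y > min_y and max_x > min_x:
            boxes.append((min_y, max_y, min_x, max_x))
    return boxes
```

Each box can then be applied to the image array, e.g. `image_np[min_y:max_y, min_x:max_x]`, and written out under a per-box file name.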

On a side note, it'd be nice if this would use both my GPUs. I'm also not sure this is properly minibatching: it seems a lot slower than I'd expect, and the GPU utilization in nvidia-smi is a lot bouncier and usually <100%.

jerryli27 commented 4 years ago

FYI I decided to have a more general solution and provided two flags to support saving either just 1 cropped object or all objects. The file name is thus a bit different from the script you provided.

gwern commented 4 years ago

Sounds good to me. Cropping just one object suited my use case, but others may prefer to crop out every detected object and clean up the results by hand.

jerryli27 / AniSeg, issue #6