TropComplique / mtcnn-pytorch

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks
MIT License

Avoid calling `np.asarray` multiple times #9

Open vladserkoff opened 6 years ago

vladserkoff commented 6 years ago

Hi! First, thanks for the implementation, nice work! I've noticed that in `get_image_boxes` the PIL.Image is cast to np.array multiple times, once for every bounding box found. AFAIK, the fix doesn't break anything; results look the same. Just for perspective, I've tested on a large image (around 20MB in size):

Timer unit: 1 s

Total time: 13.8626 s
Function: get_image_boxes at line 127

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   127                                           def get_image_boxes(bounding_boxes, img, size=24):
   128                                               """Cut out boxes from the image.
   129                                           
   130                                               Arguments:
   131                                                   bounding_boxes: a float numpy array of shape [n, 5].
   132                                                   img: an instance of PIL.Image.
   133                                                   size: an integer, size of cutouts.
   134                                           
   135                                               Returns:
   136                                                   a float numpy array of shape [n, 3, size, size].
   137                                               """
   138         2          0.0      0.0      0.0      num_boxes = len(bounding_boxes)
   139         2          0.0      0.0      0.0      width, height = img.size
   140                                           
   141         2          0.0      0.0      0.0      [dy, edy, dx, edx, y, ey, x, ex, w, h] = correct_bboxes(bounding_boxes, width, height)
   142         2          0.0      0.0      0.0      img_boxes = np.zeros((num_boxes, 3, size, size), 'float32')
   143                                           
   144       397          0.0      0.0      0.0      for i in range(num_boxes):
   145       395         13.7      0.0     99.0          img_array = np.asarray(img, 'uint8')
   146       395          0.0      0.0      0.1          img_box = np.zeros((h[i], w[i], 3), 'uint8')
   147                                                   img_box[dy[i]:(edy[i] + 1), dx[i]:(edx[i] + 1), :] =\
   148       395          0.0      0.0      0.2              img_array[y[i]:(ey[i] + 1), x[i]:(ex[i] + 1), :]
   149                                           
   150                                                   # resize
   151       395          0.0      0.0      0.3          img_box = Image.fromarray(img_box)
   152       395          0.0      0.0      0.1          img_box = img_box.resize((size, size), Image.BILINEAR)
   153       395          0.0      0.0      0.1          img_box = np.asarray(img_box, 'float32')
   154                                           
   155       395          0.0      0.0      0.1          img_boxes[i, :, :, :] = _preprocess(img_box)
   156                                           
   157         2          0.0      0.0      0.0      return img_boxes

With the fix, total time is reduced to ~1 second.
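The fix itself isn't shown in this post. As a minimal sketch of the idea, assuming it amounts to moving the `np.asarray` call out of the per-box loop, something like the following would convert the image once and reuse the array for every crop. The function name `cut_boxes` and the `(x1, y1, x2, y2)` box format are hypothetical, and the zero-padding, resize and `_preprocess` steps of the real `get_image_boxes` are left out to keep it short:

```python
import numpy as np


def cut_boxes(img, boxes):
    """Crop boxes from an image, converting the image to an array only once.

    Hypothetical helper, not the repository's get_image_boxes.

    img: anything np.asarray accepts (e.g. a PIL.Image), shape [H, W, 3].
    boxes: iterable of (x1, y1, x2, y2) integer corners, inclusive,
        assumed to lie inside the image.
    """
    # Hoisted out of the loop: for a large PIL.Image this conversion
    # is the expensive step, so it should run once per call, not once per box.
    img_array = np.asarray(img, 'uint8')

    crops = []
    for x1, y1, x2, y2 in boxes:
        # Slicing an existing array is cheap; copy so each crop owns its data.
        crops.append(img_array[y1:(y2 + 1), x1:(x2 + 1), :].copy())
    return crops
```

In the profile above, line 145 is hit 395 times and accounts for 99% of the time, so running that conversion once before the loop is what would be expected to bring the total down.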