cypw / PyTorch-MFNet

MIT License
252 stars 56 forks

Augmentations on GPU #8

Open jhagege opened 6 years ago

jhagege commented 6 years ago

Hi, great code! I have noticed that GPU usage is a bit low (around 40%) and have been trying to optimize. HLSTransform in particular is very CPU intensive. Are you aware of any way to have it executed on the GPU instead of the CPU? Do you think that could help? Thanks!

cypw commented 6 years ago

I haven't found an HLS implementation on the GPU. It might be helpful if the color augmentation could be done on the GPU side.
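One possible workaround, not an HLS implementation but a GPU-side approximation with plain tensor ops (PyTorch is assumed available; the function name and defaults below are illustrative, not from this repo), is to jitter brightness and saturation directly on whatever device the clip lives on:

```python
import torch

def gpu_color_jitter(clip: torch.Tensor, b_max: float = 0.1, s_max: float = 0.1) -> torch.Tensor:
    """clip: float tensor in [0, 1], shape (3, T, H, W).
    Applies one random brightness and saturation shift to the whole clip,
    on the clip's own device (CPU or GPU). Hypothetical helper, not the
    repo's HLSTransform."""
    b = 1.0 + (torch.rand(1, device=clip.device).item() * 2 - 1) * b_max
    s = 1.0 + (torch.rand(1, device=clip.device).item() * 2 - 1) * s_max
    gray = clip.mean(dim=0, keepdim=True)   # per-pixel gray value as a luma proxy
    jittered = gray + (clip - gray) * s     # scale color distance from gray (saturation)
    return (jittered * b).clamp(0.0, 1.0)   # brightness shift, then clip to valid range
```

Because the whole pipeline is tensor ops, it runs inside the training loop on the GPU instead of in the CPU-bound data loader.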

Besides reducing the cost of data augmentation, you can also reduce the cost of decoding the video files. For the Kinetics dataset, I found that converting the default *.mp4 files using the command below significantly speeds up the decoding stage:

For example:

ffmpeg -y -i ${SRC_VID} -c:v mpeg4 -filter:v "scale=min(iw\,(256*iw)/min(iw\,ih)):-1" -b:v 512k -an ${DST_VID}
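To apply this to a whole directory, a small batch helper can wrap the same command. A minimal sketch in Python (the flags mirror the command above; `src_dir`/`dst_dir` and the function names are placeholders for your setup, and ffmpeg must be on the PATH):

```python
import subprocess
from pathlib import Path

# ffmpeg filtergraph string from the command above; the commas inside min()
# are backslash-escaped because commas separate filters in ffmpeg syntax
SCALE = r"scale=min(iw\,(256*iw)/min(iw\,ih)):-1"

def convert_cmd(src: str, dst: str) -> list:
    """Build the ffmpeg argument list for one video (same flags as above)."""
    return ["ffmpeg", "-y", "-i", src,
            "-c:v", "mpeg4",
            "-filter:v", SCALE,
            "-b:v", "512k",
            "-an", dst]

def convert_dir(src_dir: str, dst_dir: str) -> None:
    """Re-encode every *.mp4 in src_dir into dst_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for vid in sorted(Path(src_dir).glob("*.mp4")):
        subprocess.run(convert_cmd(str(vid), str(dst / vid.name)), check=True)
```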
jhagege commented 6 years ago

Thanks so much for your feedback, this is helpful. I'll give it a look.

jhagege commented 6 years ago

@cypw By the way, did you try converting videos to h264 / h265 ? Did you notice a significant improvement with mpeg4 compared to those ? Thanks !

georkap commented 4 years ago

Hi, this comes a bit late, but replacing numpy functions with their cv2 equivalents as much as possible in the __call__ function of the RandomHLS augmentation saves significant CPU processing time. Essentially, it substitutes the np.minimum and np.maximum calls with saturating cv2 operations. Snippet below, hope it helps.

# Assumes self.rng is a np.random.RandomState and self.vars holds the
# three maximum HLS offsets, as in the repo's RandomHLS transform.
import cv2
import numpy as np

def __call__(self, data):
    assert data.ndim == 3, 'cannot operate on a single channel'
    h, w, c = data.shape
    assert c % 3 == 0, "input channel = %d, illegal" % c
    num_ims = c // 3

    # one random offset per HLS channel; the trailing 0 pads to a 4-element cv2 scalar
    random_vars = tuple(int(round(self.rng.uniform(-x, x))) for x in (self.vars + [0]))
    augmented_data = np.zeros(data.shape, dtype=np.uint8)

    for i_im in range(num_ims):  # process each stacked RGB frame
        start, end = 3 * i_im, 3 * (i_im + 1)
        augmented_data[:, :, start:end] = cv2.cvtColor(data[:, :, start:end], cv2.COLOR_RGB2HLS)
        # cv2.add saturates uint8 arithmetic, replacing the np.minimum/np.maximum clipping
        augmented_data[:, :, start:end] = cv2.add(augmented_data[:, :, start:end], random_vars, dtype=cv2.CV_8UC3)
        # OpenCV 8-bit hue lives in [0, 180]; clamp any overflow back to 180
        mask = cv2.inRange(augmented_data[:, :, start], 0, 180)
        augmented_data[mask == 0, start] = 180
        augmented_data[:, :, start:end] = cv2.cvtColor(augmented_data[:, :, start:end], cv2.COLOR_HLS2RGB)

    return augmented_data