preproc() in data_augment.py is slow

Hi.

I profiled tools/demo_track.py with bytetrack_nano on my own images:

python3 -m cProfile -o output.prof tools/demo_track.py image --path images -f exps/example/mot/yolox_nano_mix_det.py -c pretrained/bytetrack_nano_mot17.pth.tar --fp16 --fuse --save_result

The profile shows that preproc() in data_augment.py is the bottleneck.
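
For anyone who wants to reproduce the reading of output.prof, the standard-library pstats module can list the most expensive calls. This is a minimal sketch; top_functions is a hypothetical helper name, not part of ByteTrack.

```python
import cProfile
import io
import pstats


def top_functions(profile_path, limit=10):
    """Return the `limit` most expensive entries, sorted by cumulative time."""
    buffer = io.StringIO()
    stats = pstats.Stats(profile_path, stream=buffer)
    # strip_dirs() shortens file paths so the report is easier to scan.
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()
```

Calling print(top_functions("output.prof")) prints the report; the "cumtime" column is the time spent in a function including everything it calls.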

I would like to know how to speed up preproc(). Any suggestions?

By the way, preproc() does not dominate the execution time when I benchmark YOLOX by itself. As far as I can see, ByteTrack's version differs from YOLOX's in the places marked "==== HERE ====" below.

def preproc(image, input_size, mean, std, swap=(2, 0, 1)):
    if len(image.shape) == 3:
        padded_img = np.ones((input_size[0], input_size[1], 3)) * 114.0
    else:
        padded_img = np.ones(input_size) * 114.0

    img = np.array(image)    # ==== HERE ====
    r = min(input_size[0] / img.shape[0], input_size[1] / img.shape[1])

    resized_img = cv2.resize(
        img,
        (int(img.shape[1] * r), int(img.shape[0] * r)),
        interpolation=cv2.INTER_LINEAR,
    ).astype(np.float32)    # ==== HERE ====
    padded_img[: int(img.shape[0] * r), : int(img.shape[1] * r)] = resized_img

    # ==== HERE ====
    padded_img = padded_img[:, :, ::-1]
    padded_img /= 255.0
    if mean is not None:
        padded_img -= mean
    if std is not None:
        padded_img /= std
    # ==== END ====

    padded_img = padded_img.transpose(swap)
    padded_img = np.ascontiguousarray(padded_img, dtype=np.float32)
    return padded_img, r
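
One possible direction, not measured on the benchmark above: np.ones(...) * 114.0 allocates a float64 array, and the channel flip turns padded_img into a non-contiguous view, so the normalization runs in double precision on a strided array. The sketch below keeps everything in float32 and flips the channels while copying. pad_and_normalize is a hypothetical helper that covers only the steps after cv2.resize, and it assumes a 3-channel image.

```python
import numpy as np


def pad_and_normalize(resized_img, input_size, mean, std, swap=(2, 0, 1)):
    """Pad, flip channel order, and normalize a resized HxWx3 image in float32."""
    h, w = resized_img.shape[:2]
    # Allocate float32 from the start instead of the default float64.
    padded = np.full((input_size[0], input_size[1], 3), 114.0, dtype=np.float32)
    # Reverse the channel order while copying so `padded` stays contiguous.
    # The padding is 114 in every channel, so flipping it changes nothing.
    padded[:h, :w] = resized_img[:, :, ::-1]

    # In-place normalization on a contiguous float32 array.
    padded /= 255.0
    if mean is not None:
        padded -= np.asarray(mean, dtype=np.float32)
    if std is not None:
        padded /= np.asarray(std, dtype=np.float32)

    # transpose returns a view; ascontiguousarray makes the one final copy.
    return np.ascontiguousarray(padded.transpose(swap))
```

Inside preproc() this would take the output of cv2.resize directly (without the .astype(np.float32) call, since the copy into padded already converts), and the ratio r would be returned alongside as before. The result should match the original up to float32 rounding.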

ifzhang / ByteTrack
