yasenh / libtorch-yolov5

A LibTorch inference implementation of YOLOv5
MIT License
372 stars, 114 forks

Pre-processing step takes very long if processing several frames simultaneously #56

Closed: alawliat closed this issue 2 years ago

alawliat commented 2 years ago

Hi,

I am trying to run the detector on several videos at once (the input is a batch of cv::Mat objects). The bottleneck is the pre-processing step, which is computationally heavy in its current form (cloning each frame, converting between RGB and BGR, concatenating, etc.). Are there any suggestions on how to optimize this step for batches?

I skimmed the Python implementation, and I think it speeds things up by using vectorized NumPy operations instead of OpenCV calls, although I am unsure whether this would make a significant difference.
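
To make the question concrete, here is a minimal sketch of one possible direction: folding the per-frame clone, BGR-to-RGB swap, scaling, and concatenation into a single pass that writes directly into one preallocated batch buffer. It is not code from this repository. The helper name `PackBatchBgrToNchwRgb` is hypothetical, and the sketch assumes every frame has already been resized or letterboxed to the same height and width and is stored as contiguous 8-bit BGR data in HWC order (which is what `cv::Mat::data` points to when `isContinuous()` is true). Whether it is actually faster than the existing path would need to be measured.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of libtorch-yolov5.
// Packs N equally sized, contiguous 8-bit BGR frames (HWC layout) into one
// float buffer in NCHW layout with RGB channel order, scaled to [0, 1].
std::vector<float> PackBatchBgrToNchwRgb(
    const std::vector<const std::uint8_t*>& frames,
    std::size_t height, std::size_t width) {
  const std::size_t plane = height * width;  // pixels per channel plane
  std::vector<float> batch(frames.size() * 3 * plane);

  for (std::size_t n = 0; n < frames.size(); ++n) {
    const std::uint8_t* src = frames[n];
    if (src == nullptr) {
      throw std::invalid_argument("null frame pointer");
    }
    float* dst = batch.data() + n * 3 * plane;  // start of frame n in the batch

    // One pass per frame: channel swap, normalization, and layout change
    // happen together, with no intermediate copies.
    for (std::size_t i = 0; i < plane; ++i) {
      dst[0 * plane + i] = src[3 * i + 2] / 255.0f;  // R
      dst[1 * plane + i] = src[3 * i + 1] / 255.0f;  // G
      dst[2 * plane + i] = src[3 * i + 0] / 255.0f;  // B
    }
  }
  return batch;
}
```

The resulting buffer could then be wrapped as a tensor of shape {N, 3, H, W} with `torch::from_blob`, keeping in mind that `from_blob` does not take ownership, so the vector has to outlive the tensor (or the tensor has to be cloned). Each iteration of the outer loop touches only its own frame, so the frames could also be processed in parallel.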

Any help would be appreciated, thank you!