tnc-ca-geo / animl-ml

Machine Learning resources for camera trap data processing

speeding up yolov5 megadetector inference #105

Closed (rbavery closed this issue 1 year ago)

rbavery commented 1 year ago

Inference for the fully reproduced MegaDetector v5a model is currently about 9 seconds per image. This PR speeds that up; the approaches are detailed in the follow-up comments.

See the README.md for instructions on downloading model weights, packaging the model, running the TorchServe container, and sending image POST requests. This PR also adds two notebooks that can be used to:

1) compare models on folders of images, or 2) run single-image inference locally, debug each step, and compare with the TorchServe container results.
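Sending an image POST request to the running container can be sketched as follows. This is a minimal sketch, not the repo's client code: it assumes TorchServe's default inference port (8080), and `"mdv5a"` is a hypothetical model name; substitute whatever name the model archive was registered under.

```python
import json
import urllib.request

def detect(image_path, model_name="mdv5a", host="http://localhost:8080"):
    """POST an image to a running TorchServe container and return detections.

    Assumptions: TorchServe's default inference port (8080), and "mdv5a" as
    the registered model name -- use the name chosen when packaging the
    model archive.
    """
    url = f"{host}/predictions/{model_name}"
    with open(image_path, "rb") as f:
        payload = f.read()
    req = urllib.request.Request(url, data=payload, method="POST")
    with urllib.request.urlopen(req) as resp:
        # TorchServe handlers typically return JSON detections
        # (boxes, classes, confidences).
        return json.load(resp)
```

For example, `detect("camera_trap_01.jpg")` would return the handler's JSON output for that image, assuming the container is up.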

rbavery commented 1 year ago

Inference is currently about 8 seconds per image. This branch is for investigating how to speed this up by:

  • compiling to TorchScript, independent of image size changes
  • reducing image size while preserving accuracy as much as possible, potentially with multiple compiled TorchScript models for different image sizes
  • other optimizations (Neural Magic, ONNX, TensorRT)
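The TorchScript compilation in the first bullet can be sketched with `torch.jit.trace`. This is a sketch under stated assumptions: a trivial conv module stands in for the real MegaDetector v5a YOLOv5 model (which is loaded from the downloaded weights), and the output filename is hypothetical.

```python
import torch

# Stand-in for the loaded MegaDetector v5a YOLOv5 module, so this sketch
# is self-contained; the real model comes from the downloaded weights.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
).eval()

# Trace with a dummy batch at the target input size. The resulting
# TorchScript module is specialized to the shapes seen during tracing,
# which is why separate compiled models per image size may be needed.
example = torch.zeros(1, 3, 640, 640)
with torch.no_grad():
    traced = torch.jit.trace(model, example)

traced.save("md_v5a_640.torchscript.pt")  # hypothetical filename
```

Tracing bakes in the observed tensor shapes, which is the trade-off behind keeping multiple compiled models for different image sizes.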

See the README.md for instructions on downloading model weights, packaging the model, running the TorchServe container, and sending image POST requests.

Goal: average inference time of 2-3 seconds per image. We were able to achieve this by resizing all images to 640x640 px and using a TorchScript model compiled for that size, but this degraded detection accuracy.
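Resizing to a fixed 640x640 input is typically done with YOLOv5-style letterboxing (scale to fit, pad the borders). A dependency-light sketch, assuming NumPy arrays and using nearest-neighbor resizing to stay self-contained; `letterbox` is a hypothetical helper name, not code from this repo:

```python
import numpy as np

def letterbox(img, new_size=640, pad_value=114):
    """Resize an HxWx3 image to new_size x new_size, preserving aspect
    ratio and padding the borders (YOLOv5-style preprocessing for a
    fixed-size compiled model). Nearest-neighbor resize keeps the sketch
    dependency-free; real pipelines use bilinear interpolation.
    """
    h, w = img.shape[:2]
    scale = new_size / max(h, w)
    nh, nw = round(h * scale), round(w * scale)
    # Nearest-neighbor index maps for rows and columns.
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    # Center the resized image on a padded canvas.
    out = np.full((new_size, new_size, 3), pad_value, dtype=img.dtype)
    top = (new_size - nh) // 2
    left = (new_size - nw) // 2
    out[top:top + nh, left:left + nw] = resized
    return out, scale, (top, left)
```

The returned scale and offsets are needed afterwards to map predicted boxes back to the original image coordinates.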

rbavery commented 1 year ago

After compiling to ONNX, inference takes 1.7 seconds per image vs. ~5 seconds without compilation! This is on my local desktop; we'll test this on an endpoint early next week.
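Per-image timings like the ones above can be compared with a small harness. A sketch under stated assumptions: `run_inference` is a placeholder for whichever backend is being timed (the uncompiled model, TorchScript, or an ONNX Runtime session call), and a couple of warm-up runs are discarded since the first calls often include one-time initialization.

```python
import time

def average_latency(run_inference, images, warmup=2):
    """Average per-image inference time in seconds.

    run_inference: placeholder callable for the backend under test
    (e.g. a TorchScript module or an ONNX Runtime session wrapper).
    Warm-up calls are excluded because initial runs often pay one-time
    graph-compilation or allocation costs.
    """
    for img in images[:warmup]:
        run_inference(img)
    start = time.perf_counter()
    for img in images:
        run_inference(img)
    return (time.perf_counter() - start) / len(images)
```

Running the same image set through each backend with this harness gives directly comparable numbers, e.g. the 1.7 s vs. ~5 s figures reported above.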