Stage 1: Fixing the Pre/Post Processing Steps for Proper Bounding Box Scaling
[x] Take a look at the Ultralytics source code (examples/Ultralytics Module/model.py, predictor.py, results.py) and make notes/copy the relevant code for preprocessing images and postprocessing bounding box data
[ ] Improve the Python MVP package by integrating that code for preprocessing/postprocessing into the relevant ROS2 nodes
[ ] Test the package on the Jetson and make sure the extermination node shows properly scaled bounding boxes at the locations we expect
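As a reference point for the Stage 1 tasks, here is a minimal NumPy sketch of mapping boxes from the model's letterboxed input back to the original image. It assumes the preprocessing resizes with a preserved aspect ratio and pads symmetrically (letterbox style), and that boxes are in xyxy pixel format. The function names are hypothetical and this is a simplification, not the Ultralytics code itself, so the rounding details should be checked against the notes taken from predictor.py and results.py.

```python
import numpy as np


def letterbox_params(orig_shape, new_shape=(640, 640)):
    """Return (gain, pad_x, pad_y) for fitting an image of orig_shape (h, w)
    into new_shape (h, w) with the aspect ratio preserved and symmetric padding."""
    orig_h, orig_w = orig_shape
    new_h, new_w = new_shape
    gain = min(new_h / orig_h, new_w / orig_w)   # resize factor applied to the image
    pad_x = (new_w - round(orig_w * gain)) / 2   # padding added on the left/right
    pad_y = (new_h - round(orig_h * gain)) / 2   # padding added on the top/bottom
    return gain, pad_x, pad_y


def scale_boxes_to_original(boxes_xyxy, orig_shape, new_shape=(640, 640)):
    """Map xyxy boxes from model-input pixels back to original-image pixels."""
    gain, pad_x, pad_y = letterbox_params(orig_shape, new_shape)
    boxes = np.asarray(boxes_xyxy, dtype=np.float32).copy()
    boxes[:, [0, 2]] -= pad_x                    # undo horizontal padding
    boxes[:, [1, 3]] -= pad_y                    # undo vertical padding
    boxes[:, :4] /= gain                         # undo the resize
    orig_h, orig_w = orig_shape
    boxes[:, [0, 2]] = boxes[:, [0, 2]].clip(0, orig_w)  # keep boxes inside the image
    boxes[:, [1, 3]] = boxes[:, [1, 3]].clip(0, orig_h)
    return boxes
```

For a 1280x720 frame fed to a 640x640 model, the gain is 0.5 and the vertical padding is 140 px, so a model-space box of [0, 140, 640, 500] should map back to the full frame. If the boxes drawn by the extermination node are offset or squashed, a mismatch between these parameters and the actual preprocessing is a likely place to look.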
Stage 2: Accelerating the Pre/Post Processing Steps with CUDA
[ ] Use the CuPy library to accelerate the NumPy operations and compare the results against the NumPy baseline
[ ] Experiment with some alternatives for CUDA pre/post processing and compare the results
  [ ] ex. PyTorch NMS module
  [ ] ex. OpenCV CUDA Module
  [ ] ex. JAX?
  [ ] ex. Numba?
[ ] Try executing operations against different object types
  [ ] ex. OpenCV GpuMat
  [ ] ex. CuPy ndarray
  [ ] ex. CUDA Stream
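One way to set up the NumPy vs. CuPy comparison from Stage 2 is to write each operation against a generic array module and time it on both backends. The sketch below is an assumption about how that could look, not existing project code: the helper names are hypothetical, the box conversion is just a stand-in for the real pre/post processing steps, and CuPy is treated as optional so the same file still runs on a machine without a GPU.

```python
import time

import numpy as np

try:
    import cupy as cp  # optional; only needed for the GPU measurement
except Exception:      # CuPy missing or unusable on this machine
    cp = None


def xywh_to_xyxy(xp, boxes):
    """Convert center-format boxes (cx, cy, w, h) to corner format (x1, y1, x2, y2).
    xp is the array module to use: numpy or cupy."""
    half = boxes[:, 2:4] / 2
    return xp.concatenate([boxes[:, 0:2] - half, boxes[:, 0:2] + half], axis=1)


def time_backend(xp, boxes_np, repeats=20, warmup=3):
    """Return (mean seconds per call, result as a NumPy array) for one backend."""
    on_gpu = cp is not None and xp is cp
    boxes = xp.asarray(boxes_np)                 # host-to-device copy when xp is cupy
    for _ in range(warmup):                      # warmup runs are not timed
        xywh_to_xyxy(xp, boxes)
    if on_gpu:
        cp.cuda.Stream.null.synchronize()        # wait for queued GPU work
    start = time.perf_counter()
    for _ in range(repeats):
        out = xywh_to_xyxy(xp, boxes)
    if on_gpu:
        cp.cuda.Stream.null.synchronize()        # GPU calls are asynchronous
    mean_s = (time.perf_counter() - start) / repeats
    return mean_s, (cp.asnumpy(out) if on_gpu else out)
```

Calling `time_backend(np, boxes)` and, when CuPy is available, `time_backend(cp, boxes)` on the same input gives a per-call time for each backend plus outputs that can be checked with `np.allclose`. Note that this timing excludes the initial host-to-device copy; for small arrays that transfer can outweigh the compute savings, so it may be worth measuring separately.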
Stage 3: TensorRT Engine Conversion/Inference
[ ] Investigate using TensorRT Engine Plugins to bake the pre/post processing into the engine file itself (with Amy)
[ ] Repair the comparison scripts under /python_wip and /conversion_tools and report on the results of the different methods
[ ] For every verify function, run it on a few random inputs (with warmup) and average the values
[ ] Copy the logic for comparing prediction consistency and confidence independently, using Valery's functions in ONNX_verify
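The last two tasks could be sketched roughly as follows: run a reference and a converted model on a few random inputs after some untimed warmup passes, then average class agreement (consistency) and confidence difference as two separate numbers. This is a hypothetical stand-in, not Valery's functions from ONNX_verify. It assumes each model is wrapped as a callable that takes an input array and returns `(class_ids, confidences)` with detections in the same order and of the same length, which real detector output may not guarantee without a matching step.

```python
import numpy as np


def compare_predictions(ref_fn, test_fn, input_shape=(1, 3, 640, 640),
                        runs=5, warmup=2, seed=0):
    """Return (mean class agreement, mean absolute confidence difference)
    over `runs` random inputs, after `warmup` untimed, unscored passes."""
    rng = np.random.default_rng(seed)

    for _ in range(warmup):                      # let both backends initialize
        x = rng.random(input_shape, dtype=np.float32)
        ref_fn(x)
        test_fn(x)

    agreement, conf_diff = [], []
    for _ in range(runs):
        x = rng.random(input_shape, dtype=np.float32)
        ref_cls, ref_conf = ref_fn(x)
        tst_cls, tst_conf = test_fn(x)
        # Consistency: fraction of detections with the same predicted class.
        agreement.append(np.mean(np.asarray(ref_cls) == np.asarray(tst_cls)))
        # Confidence: mean absolute gap between the two models' scores.
        conf_diff.append(np.mean(np.abs(np.asarray(ref_conf) - np.asarray(tst_conf))))

    return float(np.mean(agreement)), float(np.mean(conf_diff))
```

Reporting the two numbers separately makes it possible to tell a conversion that flips predicted classes apart from one that only shifts the confidence scores slightly.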