Open Neural Network Exchange (ONNX)-compatible implementation of DeDoDe 🎶 Detect, Don't Describe - Describe, Don't Detect, for Local Feature Matching. Supports TensorRT 🚀.
The DeDoDe detector learns to detect 3D consistent repeatable keypoints, which the DeDoDe descriptor learns to match. The result is a powerful decoupled local feature matcher.
DeDoDe ONNX TensorRT provides a 2x speedup over PyTorch.
Prior to exporting the ONNX models, please install the requirements.

To convert the DeDoDe models to ONNX, run `export.py`. Two types of ONNX exports are provided: individual standalone models, and a combined end-to-end pipeline (recommended for convenience), which is enabled with the `--end2end` flag.
```shell
python export.py \
  --img_size 256 256 \
  --end2end \
  --dynamic_img_size --dynamic_batch \
  --fp16
```
If you would like to try out inference right away, you can download already-exported ONNX models here, or run `./weights/download.sh`.
With the ONNX models in hand, one can perform inference in Python using ONNX Runtime (see `requirements-onnx.txt`). The DeDoDe inference pipeline has been encapsulated into a runner class:
```python
from onnx_runner import DeDoDeRunner

# Stack the image pair(s) into shape (2B, 3, H, W).
images = DeDoDeRunner.preprocess(image_array)

# Create the ONNX Runtime runner.
runner = DeDoDeRunner(
    end2end_path="weights/dedode_end2end_1024.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    # or "TensorrtExecutionProvider"
)

# Run inference.
matches_A, matches_B, batch_ids = runner.run(images)

# Rescale the matches to the original image sizes.
matches_A = DeDoDeRunner.postprocess(matches_A, H_A, W_A)
matches_B = DeDoDeRunner.postprocess(matches_B, H_B, W_B)
```
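If, as in the original DeDoDe code, match coordinates come out normalized to [-1, 1], postprocessing amounts to rescaling them to pixel coordinates. The helper below is an illustrative stand-in for `DeDoDeRunner.postprocess`, not the actual implementation; the exact convention (axis ordering, half-pixel offsets) may differ.

```python
import numpy as np

def to_pixel_coords(matches: np.ndarray, h: int, w: int) -> np.ndarray:
    """Map (N, 2) keypoints from normalized [-1, 1] (x, y) to pixel coordinates.

    Illustrative sketch only; assumes x corresponds to width and y to height.
    """
    scale = np.array([w, h], dtype=np.float64)
    return (matches + 1.0) / 2.0 * scale

# A keypoint at the normalized center maps to the image center.
center = to_pixel_coords(np.array([[0.0, 0.0]]), h=480, w=640)
# → [[320., 240.]]
```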
Alternatively, you can run `infer.py` directly:
```shell
python infer.py \
  --img_paths assets/im_A.jpg assets/im_B.jpg \
  --img_size 256 256 \
  --end2end \
  --end2end_path weights/dedode_end2end_1024_fp16.onnx \
  --fp16 \
  --viz
```
TensorRT offers the best performance and greatest memory efficiency.
TensorRT inference is supported for the end-to-end model via the TensorRT Execution Provider in ONNXRuntime. Please follow the official documentation to install TensorRT. The exported ONNX models must undergo shape inference for compatibility with TensorRT.
```shell
python tools/symbolic_shape_infer.py \
  --input weights/dedode_end2end_1024.onnx \
  --output weights/dedode_end2end_1024_trt.onnx \
  --auto_merge
```
```shell
CUDA_MODULE_LOADING=LAZY python infer.py \
  --img_paths assets/DSC_0410.JPG assets/DSC_0411.JPG \
  --img_size 256 256 \
  --end2end \
  --end2end_path weights/dedode_end2end_1024_trt.onnx \
  --trt \
  --viz
```
The first run will take longer because TensorRT needs to build the `.engine` and `.profile` files; subsequent runs reuse the cached files. Only static input shapes are supported, and TensorRT will rebuild the cache whenever it encounters a different input shape.
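ONNX Runtime's TensorRT Execution Provider exposes the caching behavior through provider options. The sketch below shows how such options are typically passed; the option names follow ONNX Runtime's TensorRT Execution Provider documentation, the cache path is a hypothetical example, and the commented-out session creation assumes a TensorRT-enabled `onnxruntime-gpu` install.

```python
# Sketch: enabling TensorRT engine caching via ONNX Runtime provider options,
# so the built .engine/.profile files are reused across runs.
# Option names are from ONNX Runtime's TensorRT Execution Provider docs.
providers = [
    (
        "TensorrtExecutionProvider",
        {
            "trt_engine_cache_enable": True,          # cache built engines
            "trt_engine_cache_path": "weights/trt_cache",  # hypothetical path
            "trt_fp16_enable": True,                  # match the FP16 export
        },
    ),
    "CUDAExecutionProvider",  # fallbacks for unsupported ops
    "CPUExecutionProvider",
]

# Requires onnxruntime-gpu built with TensorRT support:
# import onnxruntime as ort
# session = ort.InferenceSession(
#     "weights/dedode_end2end_1024_trt.onnx", providers=providers
# )
```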
The inference times of the end-to-end DeDoDe pipelines are shown below.
| # Keypoints | 1024 | 2048 | 3840 | 4096 | 8192 |
|---|---|---|---|---|---|
| **Latency (ms) (RTX 4080 12GB)** | | | | | |
| PyTorch | 169.72 | 170.42 | N/A | 176.18 | 189.53 |
| PyTorch-MP | 79.42 | 80.09 | N/A | 83.80 | 96.93 |
| ONNX | 170.84 | 171.83 | N/A | 180.18 | 203.37 |
| TensorRT | 78.12 | 79.59 | 94.88 | N/A | N/A |
| TensorRT-FP16 | 33.90 | 35.45 | 42.35 | N/A | N/A |
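As a quick sanity check on the headline speedup claim, the 1024-keypoint column gives roughly 2.2x for TensorRT over PyTorch and about 5x for TensorRT-FP16 (latencies taken from the benchmark table):

```python
# Latency (ms) at 1024 keypoints, taken from the benchmark table.
pytorch = 169.72
tensorrt = 78.12
tensorrt_fp16 = 33.90

print(f"TensorRT speedup:      {pytorch / tensorrt:.2f}x")       # ~2.17x
print(f"TensorRT-FP16 speedup: {pytorch / tensorrt_fp16:.2f}x")  # ~5.01x
```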
If you use any ideas from the papers or code in this repo, please consider citing the authors of DeDoDe. Lastly, if the ONNX or TensorRT versions helped you in any way, please also consider starring this repository.
```bibtex
@article{edstedt2023dedode,
  title={DeDoDe: Detect, Don't Describe -- Describe, Don't Detect for Local Feature Matching},
  author={Johan Edstedt and Georg Bökman and Mårten Wadenbäck and Michael Felsberg},
  year={2023},
  eprint={2308.08479},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```