google-research / omniglue

Code release for CVPR'24 submission 'OmniGlue'
https://hwjiang1510.github.io/OmniGlue
Apache License 2.0

Question: inference takes a long time #5

Open lyp-deeplearning opened 1 month ago

lyp-deeplearning commented 1 month ago

Thank you for providing the source code for this interesting work. However, I have a question about the inference time. On my device (an RTX 3090 with 24 GB of VRAM), a single inference takes 2.92 seconds (averaged over 100 runs), whereas the paper reports roughly 50 fps. I look forward to your response.
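For reference, a minimal timing sketch for this kind of measurement, assuming the `omniglue.OmniGlue` constructor and `FindMatches` call shown in the repo README (the model paths and image shapes below are placeholders):

```python
import time

import numpy as np
import omniglue  # assumes the repo's package is importable

# Constructor arguments follow the repo README; adjust paths to your exports.
og = omniglue.OmniGlue(
    og_export="./models/og_export",
    sp_export="./models/sp_v6",
    dino_export="./models/dinov2_vitb14_pretrain.pth",
)

# Placeholder inputs; real images would be loaded as uint8 RGB arrays.
image0 = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
image1 = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)

og.FindMatches(image0, image1)  # warm-up run, excluded from timing

runs = 100
start = time.perf_counter()
for _ in range(runs):
    og.FindMatches(image0, image1)
print(f"average latency: {(time.perf_counter() - start) / runs:.3f} s")
```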

hwjiang1510 commented 1 month ago

The released code is based on TensorFlow and does not use any efficient transformer implementation. The reported number comes from a re-implementation in PyTorch using glue-factory.
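For illustration, this is the kind of fused attention kernel PyTorch exposes and the TF release does not use; the shapes are arbitrary, and this is not OmniGlue's actual attention module:

```python
import torch
import torch.nn.functional as F

# Illustrative only: PyTorch's fused scaled-dot-product attention
# (FlashAttention-style kernels when available).
device = "cuda" if torch.cuda.is_available() else "cpu"
q = torch.randn(1, 8, 1024, 64, device=device)  # (batch, heads, tokens, dim)
k = torch.randn(1, 8, 1024, 64, device=device)
v = torch.randn(1, 8, 1024, 64, device=device)

# Dispatches to a fused, memory-efficient kernel where supported.
out = F.scaled_dot_product_attention(q, k, v)
```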

arjunkarpur commented 1 month ago

@lyp-deeplearning one other thing to mention:

The TF models should use the GPU automatically, but the PyTorch DINOv2 code needs some modifications in dino_extract.py (see the sketch below):
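The exact snippet isn't reproduced here; the sketch below shows the standard PyTorch device-placement change, assuming dino_extract.py loads DINOv2 via `torch.hub` (the loading path and variable names are assumptions, not the file's exact code):

```python
import torch

# Place the DINOv2 model and its inputs on the GPU instead of the default CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Loading via the public torch.hub entrypoint; whether dino_extract.py loads
# the model this way is an assumption.
dino = torch.hub.load("facebookresearch/dinov2", "dinov2_vitb14")
dino = dino.to(device).eval()

# Input tensors must live on the same device as the model.
image = torch.rand(1, 3, 224, 224, device=device)  # placeholder input

with torch.no_grad():
    feats = dino.forward_features(image)["x_norm_patchtokens"]
```

The same `.to(device)` pattern applies wherever dino_extract.py builds the model and preprocesses its input images.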

After this, all models should run on the GPU, and you should see some improvement in inference latency.