Open · lyp-deeplearning opened 1 month ago

Thank you for providing the source code for this interesting work. However, I have a question regarding the inference time. On my device (RTX 3090, 24 GB), a single inference takes 2.92 seconds (averaged over 100 runs), whereas the paper reports about 50 fps. I look forward to your response.
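(For reference, the usual way to time GPU inference in PyTorch is to add warm-up iterations and explicit synchronization, since CUDA calls are asynchronous and a naive timer can otherwise measure little more than kernel launches. A minimal sketch, assuming a CUDA device; `extract_fn` and `image` are placeholders, not names from this repo:)

```python
import time
import torch

def benchmark(extract_fn, image, runs=100, warmup=10):
    """Average per-call latency in seconds; assumes extract_fn runs on a CUDA GPU."""
    for _ in range(warmup):
        extract_fn(image)        # warm-up: cuDNN autotuning, caches, lazy init
    torch.cuda.synchronize()     # drain pending GPU work before starting the clock
    start = time.perf_counter()
    for _ in range(runs):
        extract_fn(image)
    torch.cuda.synchronize()     # wait for the last kernels before stopping the clock
    return (time.perf_counter() - start) / runs
```

Measured this way, the paper's ~50 fps would correspond to roughly 0.02 s per call, against the 2.92 s observed above.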
The released code is based on TensorFlow and does not use any efficient transformer implementation. The reported numbers come from a re-implementation in PyTorch using glue-factory.
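(As one illustration of what an "efficient transformer implementation" can mean: PyTorch 2.x can dispatch attention to fused, FlashAttention-style kernels via `torch.nn.functional.scaled_dot_product_attention`, whereas a naive implementation materializes the full token-by-token score matrix. A minimal comparison sketch with placeholder shapes; this is not a claim about what glue-factory does internally:)

```python
import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
q = torch.randn(1, 8, 1024, 64, device=device)  # (batch, heads, tokens, head_dim)
k, v = torch.randn_like(q), torch.randn_like(q)

# Naive attention: builds the full (tokens x tokens) score matrix in memory.
scores = (q @ k.transpose(-2, -1)) / q.shape[-1] ** 0.5
naive = scores.softmax(dim=-1) @ v

# Fused attention: may use a FlashAttention-style kernel on supported GPUs.
fused = F.scaled_dot_product_attention(q, k, v)

print(torch.allclose(naive, fused, atol=1e-4))  # same math, different kernels
```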
@lyp-deeplearning one other thing to mention:
The TF models should take advantage of the GPU automatically, but the PyTorch DINOv2 code needs some modifications to dino_extract.py:

```python
self.model.cuda()
```

i.e., send the model to GPU memory, and

```python
out = self.model.get_intermediate_layers(image.cuda(), n=self.feature_layer)[0]
```

i.e., send the image to GPU memory with a `.cuda()` call.

After this, hopefully all models run on the GPU and you should see some inference latency improvements.
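(For context, a minimal sketch of how those two edits might sit together in an extractor class. Everything here other than the two quoted lines — the class name, constructor arguments, and the CPU fallback — is an assumption, not the repo's actual dino_extract.py:)

```python
import torch

class DinoExtract:
    """Hypothetical wrapper around a DINOv2 backbone; only the two
    .cuda() calls below are taken from the comment above."""

    def __init__(self, model, feature_layer):
        self.model = model.eval()
        self.feature_layer = feature_layer
        if torch.cuda.is_available():
            self.model.cuda()  # send the model weights to GPU memory

    @torch.no_grad()
    def extract(self, image):
        if torch.cuda.is_available():
            image = image.cuda()  # send the input image to GPU memory too
        # DINOv2's get_intermediate_layers returns a tuple of feature maps;
        # [0] keeps the first requested layer, as in the snippet above.
        return self.model.get_intermediate_layers(image, n=self.feature_layer)[0]
```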