ShirAmir / dino-vit-features

Official implementation for the paper "Deep ViT Features as Dense Visual Descriptors".
https://dino-vit-features.github.io
MIT License
383 stars, 44 forks

CUDA out of memory #6

Closed: Reagan1311 closed this issue 2 years ago

Reagan1311 commented 2 years ago

I have GPUs with 11 GB of memory, and I get an out-of-memory error when I load more than three images. The error occurs when computing the ViT attention: attn = (q @ k.transpose(-2, -1)) * self.scale

I think I could increase the stride or decrease the load size, but either would also degrade performance.
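For a sense of scale, here is a rough back-of-the-envelope sketch of how large the attention matrix in the line above can get, and how stride and load size affect it. It assumes (none of this is stated in the thread) a ViT with patch size 8, 6 attention heads, float32 activations, and overlapping patches counted as 1 + (size - patch) // stride per dimension, plus one [CLS] token. The function names are hypothetical and the numbers cover a single layer and a single image only.

```python
def num_tokens(height, width, patch_size=8, stride=4):
    """Token count for overlapping patches plus one [CLS] token (assumed layout)."""
    n_h = 1 + (height - patch_size) // stride
    n_w = 1 + (width - patch_size) // stride
    return n_h * n_w + 1


def attention_bytes(height, width, patch_size=8, stride=4,
                    num_heads=6, bytes_per_elem=4):
    """Bytes for one layer's attention matrix (heads x tokens x tokens), one image."""
    tokens = num_tokens(height, width, patch_size, stride)
    return num_heads * tokens * tokens * bytes_per_elem


if __name__ == "__main__":
    # Compare a few load sizes and strides; values are illustrative only.
    for size in (224, 448):
        for stride in (4, 8):
            gib = attention_bytes(size, size, stride=stride) / 2**30
            print(f"size={size} stride={stride} "
                  f"tokens={num_tokens(size, size, stride=stride)} "
                  f"attn={gib:.3f} GiB")
```

Because the matrix grows with the square of the token count, halving the stride or doubling the image side length multiplies this term by roughly 16, which is consistent with the error appearing at the attention computation.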

I noticed that the code processes only a single image at a time, so I would like to ask whether I can run the program across multiple GPUs.

ShirAmir commented 2 years ago

What is the resolution of the images you use? The code should work fine on images of roughly 300-400 pixels per dimension. There are currently no plans to support multiple GPUs in this codebase. You could, however, split the code into two separate scripts. The first extracts features for a single image on the GPU and stores them in memory. The second loads the features from the different images and runs the applications on them (using the CPU alone). This removes the need to hold many images on the GPU in parallel.
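A minimal sketch of the two-stage split described above, not the repository's own code. It assumes the features are written to disk between the two stages, and the names `extract_and_store`, `load_features`, and `extract_fn` are hypothetical. `extract_fn` stands in for whatever callable runs the extractor on one image and returns its features already moved off the GPU.

```python
import pickle
from pathlib import Path


def extract_and_store(image_paths, extract_fn, out_dir):
    """Stage 1: extract features one image at a time and save each to disk."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for image_path in image_paths:
        # Only one image's features are held at any moment.
        features = extract_fn(image_path)
        out_path = out_dir / (Path(image_path).stem + ".pkl")
        with open(out_path, "wb") as f:
            pickle.dump(features, f)
        saved.append(out_path)
        del features  # drop the reference before the next image
    return saved


def load_features(feature_paths):
    """Stage 2: load the stored features for the CPU-only application."""
    loaded = []
    for path in feature_paths:
        with open(path, "rb") as f:
            loaded.append(pickle.load(f))
    return loaded
```

With PyTorch, `extract_fn` would typically run under `torch.no_grad()` and return the descriptors after calling `.cpu()` on them, and `torch.save` / `torch.load` could replace `pickle` here. Files are named after the image stem, so images that share a stem would overwrite each other in this sketch.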

Reagan1311 commented 2 years ago

Thanks for the reply, I will close the issue now : )