RyanWangZf / MedCLIP

EMNLP'22 | MedCLIP: Contrastive Learning from Unpaired Medical Images and Texts

Embedding running out of GPU memory #33

Open michaelgerloff opened 8 months ago

michaelgerloff commented 8 months ago

Hi, first of all: thanks for creating MedCLIP. It seems to be an amazing library. I'd like to embed several hundred images with the MedCLIPProcessor. However, my GPU memory filled up rather quickly, so I had to copy each embedding to CPU memory as it was produced, which is of course rather slow. I also tried to run the embedding on the CPU, but the input tensors (CUDA tensors) and the weight tensors (regular torch tensors on the CPU) are not compatible with each other.

Is there a way to run the MedCLIPProcessor on batches of images? Is there a way to force the input tensors to be regular (CPU) torch tensors? Is there a way to actually run the embedding process on the CPU?

Best, Michael
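For readers with the same problem, the batching question can be illustrated with a minimal PyTorch sketch. It is not MedCLIP's API: `embed_in_batches` is a hypothetical helper, and the small `encoder` is a stand-in for whatever model call produces the image embeddings. The sketch assumes the memory growth comes from keeping autograd state and results on the GPU, which may or may not be the cause here. It runs each batch under `torch.no_grad()`, moves inputs and model to the same device, and copies the results to the CPU once per batch rather than once per image.

```python
import torch
from torch import nn


def embed_in_batches(encode_fn, inputs, batch_size=32, device="cpu"):
    """Encode `inputs` (shape N x ...) in chunks and collect the results on the CPU.

    `encode_fn` is any callable mapping a batch tensor to an embedding tensor
    (hypothetical stand-in for the real model call).
    """
    chunks = []
    # no_grad: no autograd graph is recorded, so intermediate activations
    # can be freed as soon as each batch has been processed.
    with torch.no_grad():
        for start in range(0, inputs.shape[0], batch_size):
            # Inputs must live on the same device as the model weights.
            batch = inputs[start:start + batch_size].to(device)
            emb = encode_fn(batch)
            # One device-to-host copy per batch instead of one per image.
            chunks.append(emb.cpu())
    return torch.cat(chunks, dim=0)


# Stand-in encoder and data (assumptions for illustration only).
device = "cuda" if torch.cuda.is_available() else "cpu"
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 16)).to(device).eval()
images = torch.randn(10, 3, 8, 8)

embeddings = embed_in_batches(encoder, images, batch_size=4, device=device)
```

Setting `device="cpu"` for both the model and the helper keeps everything on the CPU. Whether that works with MedCLIP unchanged depends on whether its own code moves tensors to the GPU internally, which this sketch does not address.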

michaelgerloff commented 8 months ago

Notebook for embedding multiple folders of images with MedCLIP: https://colab.research.google.com/drive/1QWGoDQpQMploWkBU70K84YC0yH955ww-?usp=sharing

Link to the dataset: https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=157287455