AnswerDotAI / byaldi

Use late-interaction multi-modal models such as ColPali in just a few lines of code.
Apache License 2.0

Model is not being offloaded from VRAM #35

Open nishithshowri006 opened 1 month ago

nishithshowri006 commented 1 month ago

I am trying to run the model in a Jupyter notebook.

(screenshot)

  1. In the iteration above, I have not initialized the model yet.

(screenshot)

  2. After I run the cell, the model is loaded and roughly 6 GB of VRAM is occupied.

(screenshot)

  3. When I run the cell again, the VRAM usage doubles.
  4. In subsequent runs the model never occupies more than 12 GB. What I find interesting, though, is what happens when I run this inside a loop (for example, to create an index for each file I have; I don't see any other option than doing it this way): the model causes VRAM issues. How do I remove it from VRAM? I tried `torch.cuda.empty_cache()` and deleting the variable, but neither worked. Can you please help, or is there something I am doing wrong?
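The cleanup being attempted can be sketched as follows. This is a generic best-effort PyTorch pattern, not part of byaldi's API; `free_cuda_model` is a hypothetical helper, and it assumes the model is held in the notebook's global namespace. The key point is that in a notebook, hidden references (such as IPython's `_` output variable or other cells) can keep the model alive, so every strong reference must be gone before PyTorch can actually free the weights:

```python
import gc

def free_cuda_model(namespace, name):
    """Drop namespace[name] (e.g. the loaded model), then release
    cached CUDA memory back to the driver."""
    namespace.pop(name, None)  # remove the reference held by the notebook
    gc.collect()               # collect the now-unreachable model object
    try:
        import torch           # guarded so the sketch runs without torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # return cached blocks to the driver
    except ImportError:
        pass

# usage, in a notebook cell:
# free_cuda_model(globals(), "model")
```

Note that `torch.cuda.empty_cache()` only returns memory that the allocator has already cached; it cannot free tensors that are still referenced. For the loop-over-files case, loading the model once and reusing the same instance for each index (rather than re-initializing it per file) would avoid repeated allocations, assuming byaldi allows multiple indexing calls on one model.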
bclavie commented 2 weeks ago

Could you provide your notebook as a Colab notebook so I can more easily reproduce the exact issue? Thank you!

nishithshowri006 commented 2 weeks ago

Hey, here is the colab notebook. This is just a basic observation I had; you probably understand this better than I do. I added comments in the notebook on what I observed.

DebopamParam commented 1 week ago

I believe I can help with this issue. Could you assign it to me?