ChenZhongPu opened 8 months ago
I would like to use `sentence-transformers` on a low-end machine (CPU-only) to load pre-trained models, such as `paraphrase-multilingual-MiniLM-L12-v2`, and compute a sentence's embedding. How can I estimate memory usage? Is there any guideline describing the minimum system requirements for loading pre-trained models?
See the analysis I just did here:
https://github.com/michaelfeil/infinity/discussions/160#discussioncomment-8856128
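If you want to measure actual usage rather than estimate it, here is a minimal sketch (my addition, not from the linked analysis) that records the process's resident memory before and after loading the model on CPU. It assumes `psutil` is installed and reuses the model name from the question:

```python
import os

import psutil
from sentence_transformers import SentenceTransformer

proc = psutil.Process(os.getpid())
rss_before = proc.memory_info().rss

# Load on CPU only.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2", device="cpu")

rss_after = proc.memory_info().rss
print(f"Loading added roughly {(rss_after - rss_before) / 1024**2:.1f} MB RSS")

# Encoding allocates working memory on top of the weights, so measure a call too.
model.encode("A quick test sentence.")
rss_encode = proc.memory_info().rss
print(f"After one encode call: {(rss_encode - rss_before) / 1024**2:.1f} MB RSS")
```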
Hello!
There are no official guidelines I'm afraid, but I've used this Space before to help estimate: https://huggingface.co/spaces/hf-accelerate/model-memory-usage
You can set the Library to `transformers` (`sentence-transformers` does not incur a lot of overhead on top of `transformers`), and enter the model of interest, e.g. `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`:
So, presumably you'll want 448.82 MB for this model when running it in float32 (the default), based on the Total Size under "Memory Usage for 'sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2'".
I hope that helps.
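As a cross-check of that number, here is a minimal sketch (an addition, not part of the Space) that derives the same float32 estimate from the model's parameter count, assuming 4 bytes per parameter:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2", device="cpu"
)

# SentenceTransformer subclasses torch.nn.Sequential, so .parameters() works.
n_params = sum(p.numel() for p in model.parameters())
approx_mb = n_params * 4 / 1024**2  # float32 = 4 bytes per parameter
print(f"{n_params:,} parameters ~= {approx_mb:.2f} MB in float32")
```

Note that this covers only the weights; tokenization and activation buffers during `encode` add some overhead on top.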
You can also review the discussion here for more details on this issue. There's some disagreement and testing is still ongoing, but it's at least a robust discussion. Hope it helps.