Open ErfolgreichCharismatisch opened 2 years ago
Loading

from sentence_transformers import SentenceTransformer, util

costs an additional 228 MB of memory, and

model = SentenceTransformer('path\\to\\sentence_transformers\\sentence-transformers_msmarco-distilbert-multilingual-en-de-v2-tmp-trained-scratch', device='cuda')

costs 1.1 GB. Loading the stored document embeddings

import pickle

with open(pkl, "rb") as fIn:
    stored_data = pickle.load(fIn)
doc_emb = stored_data['doc_emb']

costs another 971 MB, and

query_emb = model.encode(query, batch_size=6)

costs another 867 MB. How can I organize memory more efficiently?
Transformer models cost quite a lot of memory. You can try quantization to reduce the model size.
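As a minimal sketch of what quantization looks like: PyTorch's dynamic quantization converts the weights of `nn.Linear` layers to int8, which typically cuts their memory footprint to roughly a quarter. The model below is a small stand-in for a transformer encoder (the layer sizes are illustrative, not taken from the issue); in practice you would pass the loaded SentenceTransformer module instead. Note that dynamic quantization runs on CPU, not CUDA.

```python
import torch
import torch.nn as nn

# Stand-in for a transformer feed-forward block; in the real setup this
# would be the SentenceTransformer model loaded above (sizes are illustrative).
model = nn.Sequential(
    nn.Linear(768, 3072),
    nn.ReLU(),
    nn.Linear(3072, 768),
)

# Dynamic quantization stores Linear weights as int8 and quantizes
# activations on the fly at inference time. CPU only.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 768)
out = quantized(x)
print(out.shape)  # torch.Size([1, 768])
```

Whether the accuracy loss is acceptable for your retrieval task has to be checked empirically on your own queries.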