FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs
MIT License

torch.cuda.OutOfMemoryError: CUDA out of memory when computing scores with different methods #439

Open ly19970621 opened 5 months ago

ly19970621 commented 5 months ago

GPU: 4× RTX 4090 24G. The code is:

from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel('BAAI/bge-m3', use_fp16=True)

sentences_1 = ["What is BGE M3?", "Definition of BM25"]
sentences_2 = ["BGE M3 is an embedding model supporting dense retrieval, lexical matching and multi-vector interaction.", 
               "BM25 is a bag-of-words retrieval function that ranks a set of documents based on the query terms appearing in each document"]

sentence_pairs = [[i,j] for i in sentences_1 for j in sentences_2]
print(model.compute_score(sentence_pairs))

All four GPUs run out of memory. What could be causing this?
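For a sense of scale, here is a rough, back-of-the-envelope sketch (assumed values: fp16, 16 attention heads, a single layer's attention-score matrix only; these numbers are illustrative, not measured) of why padding every pair to the model's 8192-token maximum can exhaust a 24 GB card:

```python
# Rough memory estimate for one layer's attention-score matrix.
# All parameters here are illustrative assumptions, not measured values.
def attn_bytes(batch, heads, seq_len, bytes_per_el=2):
    # One attention-score tensor has batch * heads * seq_len^2 elements;
    # fp16 uses 2 bytes per element.
    return batch * heads * seq_len * seq_len * bytes_per_el

GIB = 1024 ** 3
print(attn_bytes(4, 16, 8192) / GIB)  # 8.0 GiB when padded to 8192
print(attn_bytes(4, 16, 512) / GIB)   # ~0.03 GiB when padded to 512
```

Activations, weights, and the other layers come on top of this, so padding short inputs all the way to 8192 tokens quickly overwhelms 24 GB even when split across four GPUs.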

staoxiao commented 5 months ago

Hi, the default maximum length in compute_score is 8192. The previous code padded every input to the full 8192 tokens, which consumed a lot of GPU memory. The code has been updated: padding now only goes up to the longest sequence in the input, so please try the latest code. You can also set max_passage_length to control the length.
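The fix described above can be sketched in isolation (hypothetical token lists, not the real tokenizer; pad_batch is an illustrative helper, not a FlagEmbedding function):

```python
# Sketch of the padding change: pad to the longest sequence in the
# batch, capped at max_length, instead of always padding to max_length.
def pad_batch(token_lists, pad_id=0, max_length=8192):
    # New behavior: target length is the batch's longest input
    # (old behavior was effectively target = max_length).
    target = min(max(len(t) for t in token_lists), max_length)
    return [t[:target] + [pad_id] * (target - len(t)) for t in token_lists]

batch = [[5, 6, 7], [8, 9]]
padded = pad_batch(batch)
print([len(p) for p in padded])  # [3, 3]: padded to batch max, not 8192
```

With short queries like the ones in this issue, that change alone shrinks the padded sequences from 8192 tokens to a few dozen. If your inputs are genuinely long, you can still cap them explicitly, e.g. model.compute_score(sentence_pairs, max_passage_length=512) (512 is an arbitrary example value).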