Model GPU memory usage problem

OpenMOSS / MOSS

An open-source tool-augmented conversational language model from Fudan University

https://txsun1997.github.io/blogs/moss.html

Apache License 2.0


Model GPU memory usage problem #352

Closed by wqh17101 1 year ago

wqh17101 commented 1 year ago

I downloaded the model from https://huggingface.co/fnlp/moss-moon-003-sft-plugin-int8 and then loaded it with:

from transformers import AutoModel, AutoTokenizer

def init_model():
    model_dir = '/wjk/models/moss-moon-003-sft-int8'
    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
    # Load the checkpoint, cast to half precision, and move it to the GPU
    model = AutoModel.from_pretrained(model_dir, trust_remote_code=True).half().cuda()
    model = model.eval()
    return model, tokenizer

The model loaded, but it took up 31 GB of GPU memory, the same as the figure listed for the non-quantized model. GPU: V100, torch 2.0.0, CUDA 11.8.
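For a rough sanity check on that number, here is a back-of-the-envelope estimate of weight memory at different precisions. It is only a sketch: the parameter count of 16 billion is an assumption (it is not stated in this thread), the helper name `weight_memory_gib` is made up for illustration, and the estimate covers weights only, not activations, the CUDA context, or quantization metadata.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# ASSUMPTION: roughly 16 billion parameters; adjust to the actual model size.
ASSUMED_PARAMS = 16e9

fp16_gib = weight_memory_gib(ASSUMED_PARAMS, 2)  # 2 bytes per fp16 weight
int8_gib = weight_memory_gib(ASSUMED_PARAMS, 1)  # 1 byte per int8 weight

print(f"fp16 weights: {fp16_gib:.1f} GiB")
print(f"int8 weights: {int8_gib:.1f} GiB")
```

Under that assumption, fp16 weights alone come to about 30 GiB, so a reading of 31 GB suggests the weights ended up in half precision rather than int8.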

wqh17101 commented 1 year ago

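It turns out I was using the wrong model class. It should be:

```
from transformers import AutoModelForCausalLM, AutoTokenizer

def init_model():
    model_dir = '/wjk/models/moss-moon-003-sft-int8'
    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
    # AutoModelForCausalLM instead of AutoModel
    model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True).half().cuda()
    model = model.eval()
    return model, tokenizer
```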
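To confirm which precision the weights actually ended up in after loading, one option is to total the bytes per dtype. This is a sketch rather than anything from the MOSS repo: `bytes_by_dtype` is a hypothetical helper, and it only relies on the `dtype`, `numel()`, and `element_size()` members of `torch.Tensor`. Whether the int8 checkpoint keeps its quantized weights as parameters or as buffers is not stated here, so the usage comment passes both.

```python
from collections import defaultdict
from typing import Dict, Iterable


def bytes_by_dtype(tensors: Iterable) -> Dict[str, int]:
    """Sum the storage size in bytes of the given tensors, grouped by dtype."""
    totals: Dict[str, int] = defaultdict(int)
    for t in tensors:
        # numel() is the element count, element_size() the bytes per element
        totals[str(t.dtype)] += t.numel() * t.element_size()
    return dict(totals)


# Usage with a loaded model (needs torch and the model from init_model()):
#   import itertools
#   model, tokenizer = init_model()
#   report = bytes_by_dtype(itertools.chain(model.parameters(), model.buffers()))
#   for dtype, size in report.items():
#       print(dtype, f"{size / 2**30:.2f} GiB")
```

If nearly all of the bytes show up under `torch.float16`, the quantized weights are not in use.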