OpenMOSS / MOSS

An open-source tool-augmented conversational language model from Fudan University
https://txsun1997.github.io/blogs/moss.html
Apache License 2.0
11.89k stars 1.15k forks

Model GPU memory usage issue #352

Closed wqh17101 closed 1 year ago

wqh17101 commented 1 year ago

I downloaded the model from https://huggingface.co/fnlp/moss-moon-003-sft-plugin-int8 and then loaded it with:

from transformers import AutoModel, AutoTokenizer

def init_model():
    model_dir = '/wjk/models/moss-moon-003-sft-int8'
    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_dir, trust_remote_code=True).half().cuda()
    model = model.eval()
    return model, tokenizer

The model loads, but it turns out to occupy 31 GB of GPU memory, the same as the non-quantized model in the reported figures. GPU: V100, torch 2.0.0, CUDA 11.8.

wqh17101 commented 1 year ago

[image]

wqh17101 commented 1 year ago

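It turns out I was using the wrong model class. It should be AutoModelForCausalLM:

from transformers import AutoModelForCausalLM, AutoTokenizer

def init_model():
    model_dir = '/wjk/models/moss-moon-003-sft-int8'
    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
    # Load through the causal-LM class rather than the bare AutoModel
    model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True).half().cuda()
    model = model.eval()
    return model, tokenizer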
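
As a rough sanity check on the 31 GB figure reported above, the sketch below estimates how much memory the weights alone would need at different storage precisions. It assumes the model has roughly 16 billion parameters (an assumption, not stated in this thread), and the helper names `weight_memory_gib` and `closest_precision` are hypothetical. It ignores activations, the CUDA context, and any quantization metadata, so treat the numbers as approximate.

```python
GIB = 2**30

# Bytes needed to store one weight at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(n_params, precision):
    """Approximate memory (GiB) needed for the weights alone."""
    return n_params * BYTES_PER_PARAM[precision] / GIB


def closest_precision(observed_gib, n_params):
    """Return the precision whose weight-only estimate is nearest to the observed usage."""
    return min(
        BYTES_PER_PARAM,
        key=lambda p: abs(weight_memory_gib(n_params, p) - observed_gib),
    )


# Assumption: about 16 billion parameters (not confirmed in this thread).
N_PARAMS = 16e9

for precision in BYTES_PER_PARAM:
    print(f"{precision}: {weight_memory_gib(N_PARAMS, precision):.1f} GiB")

# The 31 GB reported in the issue is closest to the fp16 estimate,
# which is consistent with the weights not having been loaded as int8.
print(closest_precision(31, N_PARAMS))
```

To apply the same check to a live process, `torch.cuda.memory_allocated() / 2**30` gives the memory held by tensors in GiB and could be passed in as `observed_gib`. Tools such as `nvidia-smi` typically report a somewhat higher number because they include the CUDA context and cached blocks.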