vivo-ai-lab / BlueLM

BlueLM(蓝心大模型): Open large language models developed by vivo AI Lab
https://developers.vivo.com/product/ai/bluelm

How can I make the model output only the answer? #8

Closed howardgriffin closed 11 months ago

howardgriffin commented 11 months ago

For example, when asking "Who is the author of Romance of the Three Kingdoms?", I want the model to output just "Luo Guanzhong", not echo the question back as "Who is the author of Romance of the Three Kingdoms? Luo Guanzhong...". How can this be done?

JoeyHeisenberg commented 11 months ago

You can record the token length of the question, then truncate the model output at that length once generation finishes.
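A minimal sketch of that idea (hypothetical names: `model` and `tokenizer` stand for an already-loaded Hugging Face causal LM and its tokenizer, and `question` is your prompt string):

inputs = tokenizer(question, return_tensors="pt")
question_len = inputs["input_ids"].shape[1]  # token length of the question
outputs = model.generate(**inputs, max_new_tokens=128)
answer_ids = outputs[0][question_len:]       # keep only the newly generated tokens
print(tokenizer.decode(answer_ids, skip_special_tokens=True))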

howardgriffin commented 11 months ago

Since latency scales with the token length of the output, doesn't that mean that, on top of the time spent producing the answer, time is also wasted outputting the question?

JoeyHeisenberg commented 11 months ago

The bulk of the time goes to model generation; tokenizer decoding takes essentially no time (and you can choose to truncate before decoding anyway). Try our two demos to get a feel for it, or benchmark the speed yourself.
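One quick way to check that split yourself (a rough timing sketch with time.perf_counter, reusing model, tokenizer, and inputs from the example later in the thread):

import time

t0 = time.perf_counter()
outputs = model.generate(**inputs, max_new_tokens=128)  # autoregressive generation
t1 = time.perf_counter()
text = tokenizer.decode(outputs[0], skip_special_tokens=True)  # detokenization only
t2 = time.perf_counter()
print(f"generate: {t1 - t0:.2f}s, decode: {t2 - t1:.4f}s")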

howardgriffin commented 11 months ago

Thanks for the guidance. I can handle truncating after decoding, but could you provide sample code for truncating before decoding?

JoeyHeisenberg commented 11 months ago

Something like this:

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("vivo-ai/BlueLM-7B-Chat", trust_remote_code=True, use_fast=False)
model = AutoModelForCausalLM.from_pretrained("vivo-ai/BlueLM-7B-Chat", device_map="cpu", trust_remote_code=True)
model = model.eval()

sentences = "今天天气怎么样?"  # "How is the weather today?"
# BlueLM-7B-Chat expects the [|Human|]: / [|AI|]: dialogue format.
inputs = tokenizer("[|Human|]:" + sentences + "[|AI|]:", return_tensors="pt")
inputs = inputs.to("cpu")
outputs = model.generate(**inputs, max_new_tokens=128)

# Truncate before decoding: slice the prompt tokens off, decode only the new ids.
print(tokenizer.decode(outputs.cpu()[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

# Truncate after decoding: decode everything, then drop the echoed question by character count.
print(tokenizer.decode(outputs.cpu()[0], skip_special_tokens=True)[len(sentences):].strip())
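A note on the two print lines: the first truncates before decoding, slicing the prompt's input_ids off the generated sequence so only new tokens are decoded; the second decodes the full sequence and then strips the echoed question by character count, which works here only if skip_special_tokens=True removes the [|Human|]:/[|AI|]: role markers (assuming they are registered as special tokens), leaving the question text at the start of the decoded string.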