vivo-ai-lab / BlueLM

BlueLM(蓝心大模型): Open large language models developed by vivo AI Lab
https://developers.vivo.com/product/ai/bluelm

How can I make the model output only the answer? #8

Closed howardgriffin closed 11 months ago

howardgriffin commented 11 months ago

For example, when asking "Who is the author of Romance of the Three Kingdoms?", I want the model to output just "Luo Guanzhong", not echo the question back as "Who is the author of Romance of the Three Kingdoms? Luo Guanzhong...". How can this be done?

JoeyHeisenberg commented 11 months ago

You can record the token length of the question, then truncate the model output at that length once generation finishes.
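A minimal sketch of that idea (hypothetical names: `model` and `tokenizer` stand for an already-loaded Hugging Face causal LM and its tokenizer, and `question` is your prompt string):

inputs = tokenizer(question, return_tensors="pt")
question_len = inputs["input_ids"].shape[1]  # token length of the question
outputs = model.generate(**inputs, max_new_tokens=128)
answer_ids = outputs[0][question_len:]       # keep only the newly generated tokens
print(tokenizer.decode(answer_ids, skip_special_tokens=True))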

howardgriffin commented 11 months ago

Since latency scales with the token length of the output, doesn't that mean that, on top of the time spent producing the answer, time is also wasted outputting the question?

JoeyHeisenberg commented 11 months ago

The bulk of the time goes to model generation; tokenizer decoding takes essentially no time (and you can choose to truncate before decoding anyway). Try our two demos to get a feel for it, or benchmark the speed yourself.
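One quick way to check that split yourself (a rough timing sketch with time.perf_counter, reusing model, tokenizer, and inputs from the example later in the thread):

import time

t0 = time.perf_counter()
outputs = model.generate(**inputs, max_new_tokens=128)  # autoregressive generation
t1 = time.perf_counter()
text = tokenizer.decode(outputs[0], skip_special_tokens=True)  # detokenization only
t2 = time.perf_counter()
print(f"generate: {t1 - t0:.2f}s, decode: {t2 - t1:.4f}s")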

howardgriffin commented 11 months ago

Thanks for the guidance. I can handle truncating after decoding, but could you provide sample code for truncating before decoding?

JoeyHeisenberg commented 11 months ago

Something like this:

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("vivo-ai/BlueLM-7B-Chat", trust_remote_code=True, use_fast=False)
model = AutoModelForCausalLM.from_pretrained("vivo-ai/BlueLM-7B-Chat", device_map="cpu", trust_remote_code=True)
model = model.eval()

sentences = "今天天气怎么样?"  # "How is the weather today?"
# BlueLM-7B-Chat expects the [|Human|]: / [|AI|]: dialogue format.
inputs = tokenizer("[|Human|]:" + sentences + "[|AI|]:", return_tensors="pt")
inputs = inputs.to("cpu")
outputs = model.generate(**inputs, max_new_tokens=128)

# Truncate before decoding: slice the prompt tokens off, decode only the new ids.
print(tokenizer.decode(outputs.cpu()[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

# Truncate after decoding: decode everything, then drop the echoed question by character count.
print(tokenizer.decode(outputs.cpu()[0], skip_special_tokens=True)[len(sentences):].strip())
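A note on the two print lines: the first truncates before decoding, slicing the prompt's input_ids off the generated sequence so only new tokens are decoded; the second decodes the full sequence and then strips the echoed question by character count, which works here only if skip_special_tokens=True removes the [|Human|]:/[|AI|]: role markers (assuming they are registered as special tokens), leaving the question text at the start of the decoded string.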