Closed: bleedingfight closed this issue 6 months ago
cc @younesbelkada
@bleedingfight hey!
Yes, transformers currently returns prompt + generated text as output for generative models. If you want only the generated part you can do:
out = model.generate(**inputs)
out_wo_prompt = out[:, inputs.input_ids.shape[-1]:]
print(tokenizer.batch_decode(out_wo_prompt, skip_special_tokens=True))
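For LLaVA specifically, here is a more complete end-to-end sketch of the same idea. It assumes the llava-hf/llava-1.5-7b-hf checkpoint, an example.jpg on disk, and the AutoProcessor / LlavaForConditionalGeneration classes; swap in your own local converted checkpoint and image:

from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # placeholder; point this at your local checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

prompt = "USER: <image>\nWhat does the picture show? ASSISTANT:"
image = Image.open("example.jpg")  # placeholder image
inputs = processor(text=prompt, images=image, return_tensors="pt")

out = model.generate(**inputs, max_new_tokens=128)
# generate returns prompt tokens followed by new tokens; slice off the prompt before decoding
out_wo_prompt = out[:, inputs["input_ids"].shape[-1]:]
print(processor.batch_decode(out_wo_prompt, skip_special_tokens=True)[0])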
@zucchini-nlp ok, thanks
System Info
transformers version: 4.39.3
Who can help?
No response
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
My model is the official LLaVA (vicuna + CLIP + mm_projector). Convert script:
I only modified the code slightly so that I could save the model locally instead of pushing it to the HF Hub, as sketched below.
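The change described is roughly the following (a hedged sketch, not the verbatim diff): in the upstream convert_llava_weights_to_hf.py, the final push_to_hub calls are replaced with save_pretrained to a local directory. The exact variable names in the script may differ:

# at the end of convert_llava_weights_to_hf.py (names are illustrative, not verbatim)
local_dir = "/path/to/local/llava-hf-checkpoint"  # placeholder path
model.save_pretrained(local_dir)       # instead of model.push_to_hub(...)
processor.save_pretrained(local_dir)   # instead of processor.push_to_hub(...)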
"A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: <image>\n下面的文章描述了一个实验。阅读文章,然后按照以下说明进行操作。\n\nMadelyn在雪板的底部涂上了一层薄蜡,然后直接下坡滑行。然后,她去掉了蜡,再次直接下坡滑行。她重复了这个过程四次,每次都交替使用薄蜡或不使用薄蜡滑行。她的朋友Tucker计时每次滑行的时间。Madelyn和Tucker计算了使用薄蜡滑行和不使用薄蜡滑行时直接下坡所需的平均时间。\n图:滑雪板下坡。\n麦德琳和塔克的实验能最好回答哪个问题?\nA. 当麦德琳的雪板上有一层薄蜡或一层厚蜡时,它是否能在较短的时间内滑下山坡?\nB. 当麦德琳的雪板上有一层蜡或没有蜡时,它是否 能在较短的时间内滑下山坡?\n请直接回答选项字母。 ASSISTANT:"
config:
{'max_new_tokens': 1024, 'temperature': 0.0, 'top_p': None, 'num_beams': 1, 'use_cache': True}
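For reference, these settings map directly onto model.generate keyword arguments. A hedged sketch, assuming greedy decoding is intended since temperature is 0.0 (temperature and top_p are dropped because they only apply when sampling):

gen_kwargs = {
    "max_new_tokens": 1024,
    "do_sample": False,  # temperature 0.0 effectively means greedy decoding
    "num_beams": 1,
    "use_cache": True,
}
out = model.generate(**inputs, **gen_kwargs)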
transformers: here is the output:
llava official:
B