Open sev777 opened 1 year ago
Hi @sev777 , as long as you use in-context learning with a few demonstrations in the prompt, the output should follow the demonstrations' format in most cases. See the prompts we used in our experiments. For `max_length`, you may just use something long enough (e.g., `max_length=300` for CoT cases) and then extract the actual prediction from the output.
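If it helps, here is a minimal sketch of that generate-then-extract step, assuming a Hugging Face `transformers` causal LM; the `cur_prompt` placeholder and the `"Question:"` stop marker are assumptions for illustration, not the exact prompt format from the repo:

```python
# Minimal sketch (not the repo's exact code): generate with a generous max_length,
# then keep only the continuation and cut it at the demonstrations' delimiter.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "lmsys/vicuna-7b-v1.3"  # the model discussed below; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).cuda()

cur_prompt = "..."  # your few-shot prompt ending with the new question

inputs = tokenizer(cur_prompt, return_tensors="pt").to("cuda")
generate_ids = model.generate(
    **inputs,
    max_length=inputs.input_ids.shape[1] + 300,  # "long enough" for CoT-style outputs
)

# Decode only the newly generated tokens, not the prompt itself.
continuation = tokenizer.batch_decode(
    generate_ids[:, inputs.input_ids.shape[1]:],
    skip_special_tokens=True,
)[0]

# Truncate where the model starts an unwanted extra demonstration;
# "Question:" is an assumed delimiter -- use whatever your prompt format actually uses.
prediction = continuation.split("Question:")[0].strip()
```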
Thank you for the reply.
But when I use the model from https://huggingface.co/lmsys/vicuna-7b-v1.3, my generation code is as follows:

```python
inputs = tokenizer(cur_prompt, return_tensors="pt")
# Generate up to 100 new tokens beyond the prompt.
generate_ids = model.generate(inputs.input_ids.cuda(), max_length=inputs.input_ids.shape[1] + 100)
# Decode only the newly generated tokens (everything after the prompt).
res = tokenizer.batch_decode(
    generate_ids[:, inputs.input_ids.shape[1]:],
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False,
)[0]
```
The `cur_prompt` is the same as the `task_prompt` in `run_mello.ipynb`.
But the `res` is: `##รভ secretaryகष [unused327] answer :รভ secretaryகष ...`
Should I use Vicuna-chat?
Thanks again!
The Vicuna model generates some unrelated output, so how should I set `max_length` in `model.generate(inputs.input_ids.cuda(), max_length=??)`?