princeton-nlp / MQuAKE

[EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions
https://arxiv.org/abs/2305.14795
MIT License

How to use the Vicuna model? #3

Open sev777 opened 1 year ago

sev777 commented 1 year ago

The Vicuna model generates some unrelated output, so how should I control the max_length in: model.generate(inputs.input_ids.cuda(), max_length=??)

a3616001 commented 1 year ago

Hi @sev777 , as long as you use in-context learning with a few demonstrations in the prompt, the output should follow the demonstrations' format in most cases. See the prompts we used in our experiments. For max_length, you may just use something long enough (e.g., max_length=300 for CoT cases) and then extract the actual prediction from the output.
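A minimal sketch of this pattern, assuming a standard Hugging Face transformers causal LM already loaded as model/tokenizer; the names task_prompt and question are illustrative placeholders, not from this repo:

```python
# Few-shot demonstrations followed by the new question, as in the prompt files used in the experiments.
prompt = task_prompt + "\n\nQ: " + question

inputs = tokenizer(prompt, return_tensors="pt")
generate_ids = model.generate(
    inputs.input_ids.cuda(),
    max_length=inputs.input_ids.shape[1] + 300,  # just needs to be long enough to cover a CoT answer
)

# Keep only the newly generated tokens, then cut at the first blank line,
# since the model often continues with another demonstration-style block.
generated = tokenizer.batch_decode(
    generate_ids[:, inputs.input_ids.shape[1]:],
    skip_special_tokens=True,
)[0]
prediction = generated.split("\n\n")[0].strip()
```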

sev777 commented 1 year ago

Thank you for the reply. But when I use the model from https://huggingface.co/lmsys/vicuna-7b-v1.3, my generation code is as follows:

inputs = tokenizer(cur_prompt, return_tensors="pt")
generate_ids = model.generate(inputs.input_ids.cuda(), max_length=inputs.input_ids.shape[1] + 100)
res = tokenizer.batch_decode(generate_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]

The cur_prompt is the same as the task_prompt in run_mello.ipynb. But res is:

##รভ secretaryகष [unused327] answer :รভ secretaryகष ...

Should I use Vicuna-chat? Thanks again!
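For reference, here is a self-contained sketch of one way to run this checkpoint (an assumption about the setup, not a confirmed fix from this thread). Fragments like [unused327] are the kind of output a tokenizer that does not match the model can produce, so the sketch loads the tokenizer and model from the same checkpoint and bounds generation with max_new_tokens:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "lmsys/vicuna-7b-v1.3"

# Load the tokenizer and the model from the same checkpoint so their vocabularies match.
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16).cuda()
model.eval()

cur_prompt = "..."  # the few-shot prompt from run_mello.ipynb goes here

inputs = tokenizer(cur_prompt, return_tensors="pt")
with torch.no_grad():
    generate_ids = model.generate(
        inputs.input_ids.cuda(),
        max_new_tokens=100,  # bound the continuation length directly instead of the total max_length
        do_sample=False,     # greedy decoding for reproducible outputs
    )

res = tokenizer.batch_decode(
    generate_ids[:, inputs.input_ids.shape[1]:],
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False,
)[0]
print(res)
```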