yzc111 opened this issue 8 months ago
Hi,
Which config are you using? Vicuna and llama2 models have a 4k context window limit, which limits how many passages you can use in the context.
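(A quick way to check whether a given shot/ndoc setting still fits in that window is to tokenize the assembled prompt and compare it against the 4k budget. A minimal sketch, assuming a local Vicuna/Llama-2 checkpoint path and an assumed token budget reserved for the generated answer:)

```python
from transformers import AutoTokenizer

# Hypothetical local checkpoint path; substitute the Vicuna/Llama-2 model you are running.
tokenizer = AutoTokenizer.from_pretrained("/work/models/vicuna-13b")

CONTEXT_WINDOW = 4096   # Vicuna / Llama-2 context limit
MAX_NEW_TOKENS = 300    # assumed budget reserved for the generated answer

def fits_in_context(prompt: str) -> bool:
    """True if the assembled few-shot prompt plus the answer budget fits in the window."""
    n_prompt_tokens = len(tokenizer(prompt)["input_ids"])
    return n_prompt_tokens + MAX_NEW_TOKENS <= CONTEXT_WINDOW
```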
Hi, thank you for your reply. The config is 2-shot with ndoc = 3.
Did you use the "light instruction" version as well?
No, I just used the default setting.
Can you try this config (but change the model name): https://github.com/princeton-nlp/ALCE/blob/main/configs/asqa_alpaca-7b_shot2_ndoc3_gtr_light_inst.yaml
OK. thanks~
Another question: when I use the following settings to reproduce the result,

prompt_file: prompts/asqa_light_inst.json
eval_file: data/asqa_eval_gtr_top100.json
shot: 2
ndoc: 3
dataset_name: asqa
tag: gtr_light_inst
model: vicuna-13b
temperature: 1.0
top_p: 0.95

I get QA-EM = 19.7 and MAUVE = 70.7, while the paper reports EM = 31.9 and MAUVE = 82.6. Are there any different settings in the config file?
Note that there is a difference between EM and QA-EM, and we report EM in the paper. Can you post the full output or the .score file? Can you also post the link to the Vicuna model that you are using? There are a couple of different versions with different performance.
Hi, this is the config we used to reproduce the result on Vicuna-13B:

prompt_file: prompts/asqa_light_inst.json
eval_file: data/asqa_eval_gtr_top100.json
shot: 2
ndoc: 3
dataset_name: asqa
tag: gtr_light_inst
model: /work/models/vicuna-13b
temperature: 1.0
top_p: 0.95
So, how can I get the EM score reported in your paper?
That is "str_em".
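(In case it helps anyone reading this later: the evaluation run writes its metrics to the .score file mentioned above, and the number to compare with the paper is the "str_em" entry. A minimal sketch for pulling it out, assuming the .score file is a flat JSON dict and using a made-up result filename:)

```python
import json

# Hypothetical result path; point this at the .score file produced by your own run.
with open("result/asqa-vicuna-13b-gtr_light_inst-shot2-ndoc3.json.score") as f:
    scores = json.load(f)  # assumed to be a flat JSON dict of metric -> value

# "str_em" is the EM number reported in the paper.
print("EM (str_em):", scores["str_em"])
# Key name for the QA-based metric is assumed here, hence the .get().
print("QA-EM:", scores.get("qa_em"))
```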
Fine, thanks!
Hello, when I reproduce the results on Vicuna-13B and Llama2-7B, I cannot get any model output, and the code prints the warning: "Prompt exceeds max length and return an empty string as answer. If this happens too many times, it is suggested to make the prompt shorter". How should I deal with this? Thank you~
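(For anyone hitting the same warning: the empty outputs mean the assembled prompt is longer than the 4k window, so switching to the light-instruction prompts or reducing ndoc until the prompt fits should help. A rough sketch of the trimming logic, where build_prompt is a hypothetical stand-in for however you assemble the few-shot prompt:)

```python
from transformers import AutoTokenizer

# Hypothetical local checkpoint path; substitute your own model.
tokenizer = AutoTokenizer.from_pretrained("/work/models/vicuna-13b")

CONTEXT_WINDOW = 4096
MAX_NEW_TOKENS = 300  # assumed budget for the generated answer

def largest_ndoc_that_fits(build_prompt, question, docs):
    """Drop retrieved passages from the end until the prompt fits in the 4k window.

    build_prompt(question, docs) is a hypothetical helper standing in for however
    the few-shot prompt is assembled; only the trimming logic is sketched here.
    """
    for ndoc in range(len(docs), -1, -1):
        prompt = build_prompt(question, docs[:ndoc])
        if len(tokenizer(prompt)["input_ids"]) + MAX_NEW_TOKENS <= CONTEXT_WINDOW:
            return ndoc
    return 0
```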