XiangLi1999 / ContrastiveDecoding


Could you provide scripts to reproduce the results? #7

Open hzhwcmhf opened 1 year ago

hzhwcmhf commented 1 year ago

@XiangLi1999 Thanks for your great work!

I am trying to reproduce the results on wikitext but have run into some problems.

I use your script:

python run_generation.py --model_name_or_path gpt2-xl --model_type gpt2 --length 256 --prompt_file wikitext --student_name_or_path gpt2 --st_coef 1.0 --student_temperature 0.5 --outfile outputs/temp_out.jsonl --ignore_prefix no

Then I evaluate the output file with:

python eval_script.py ./outputs/temp_out.jsonl

The output is

{'name': './outputs/temp_out.jsonl', 'rep-2': 9.5, 'rep-3': 1.87, 'rep-4': 0.4, 'diversity': 0.8845241939999999, 'mauve': 0.8812567264373257, 'coherence': 0.5913593170305366} (I disabled the other metrics)

which differs from the results reported in the paper (coherence = 0.59 vs. 0.69).
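
As a sanity check on my eval run, the diversity value is consistent with the usual product over the n-gram repetition rates. A quick sketch of that relationship (my own check, assuming eval_script.py uses the standard definition):

```python
# Sanity check (my own sketch, assuming eval_script.py uses the standard
# definition): diversity = product over n of (1 - rep-n / 100).
rep_n = [9.5, 1.87, 0.4]  # rep-2, rep-3, rep-4 from the output above
diversity = 1.0
for r in rep_n:
    diversity *= 1.0 - r / 100.0
print(diversity)  # 0.884524194, matching the 'diversity' value above
```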

I find that ./outputs_ignorePrefix_ccnews_256/wikitext_results/wikitext_gpt2-1.0-t0.5_gpt2-xl_256.jsonl produces the correct metric values. May I ask two questions:

  1. What is the generation script used to produce the correct outputs?
  2. What do the values in the filename wikitext_gpt2-1.0-t0.5_gpt2-xl_256.jsonl mean? For example, 256 seems to be the output length and 0.5 the student temperature. What does the 1.0 indicate?
XiangLi1999 commented 1 year ago

Hi

  1. I think setting ignore_prefix to yes would increase the coherence score!

     python run_generation.py --model_name_or_path gpt2-xl --model_type gpt2 --length 256 --prompt_file wikitext --student_name_or_path gpt2 --st_coef 1.0 --student_temperature 0.5 --outfile outputs/temp_out.jsonl --ignore_prefix yes

  2. The 1.0 is st_coef; the scoring works like logP_expert - st_coef * logP_amateur.
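
For context, a minimal sketch of that scoring rule (illustrative only, not the repo's actual run_generation.py code; the alpha plausibility cutoff is an assumption based on the paper's adaptive plausibility constraint):

```python
# Illustrative sketch of the contrastive score described above, not the repo's code.
# `expert_logits` / `amateur_logits` stand in for gpt2-xl and gpt2 next-token logits.
import torch
import torch.nn.functional as F

def contrastive_scores(expert_logits, amateur_logits,
                       st_coef=1.0, student_temperature=0.5, alpha=0.1):
    # log-probabilities of the expert and the amateur (student);
    # the amateur distribution is sharpened by student_temperature < 1
    log_p_expert = F.log_softmax(expert_logits, dim=-1)
    log_p_amateur = F.log_softmax(amateur_logits / student_temperature, dim=-1)

    # contrastive objective: logP_expert - st_coef * logP_amateur
    scores = log_p_expert - st_coef * log_p_amateur

    # plausibility constraint (assumption, following the paper): only keep tokens
    # whose expert probability is at least alpha * max expert probability
    cutoff = torch.log(torch.tensor(alpha)) + log_p_expert.max(dim=-1, keepdim=True).values
    return scores.masked_fill(log_p_expert < cutoff, float("-inf"))

# usage: pick the next token greedily from the contrastive scores
expert_logits = torch.randn(1, 50257)   # stand-in for gpt2-xl logits
amateur_logits = torch.randn(1, 50257)  # stand-in for gpt2 logits
next_token = contrastive_scores(expert_logits, amateur_logits).argmax(dim=-1)
```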