Open hzhwcmhf opened 1 year ago
Hi
I think setting ignore_prefix to yes would increase the coherence score!
python run_generation.py --model_name_or_path gpt2-xl --model_type gpt2 --length 256 --prompt_file wikitext --student_name_or_path gpt2 --st_coef 1.0 --student_temperature 0.5 --outfile outputs/temp_out.jsonl --ignore_prefix yes
The 1.0 is st_coef; the score is roughly logP_expert - st_coef * logP_amateur.
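To make the role of st_coef concrete, here is a minimal sketch of that scoring rule on a toy three-token vocabulary. The function name `contrastive_scores` and the probabilities are made up for illustration; this is not the code in run_generation.py, and it leaves out anything else the script may do when picking tokens.

```python
import math

def contrastive_scores(expert_logprobs, amateur_logprobs, st_coef=1.0):
    """Score each candidate token as logP_expert - st_coef * logP_amateur."""
    return [e - st_coef * a for e, a in zip(expert_logprobs, amateur_logprobs)]

# Hypothetical next-token log-probabilities over a 3-token vocabulary.
expert = [math.log(0.6), math.log(0.3), math.log(0.1)]   # e.g. the larger model
amateur = [math.log(0.8), math.log(0.1), math.log(0.1)]  # e.g. the smaller model

# With st_coef = 1.0 the amateur's preferences are subtracted in full.
scores = contrastive_scores(expert, amateur, st_coef=1.0)
best = max(range(len(scores)), key=scores.__getitem__)

# With st_coef = 0.0 the rule reduces to the expert's own log-probabilities.
expert_only = contrastive_scores(expert, amateur, st_coef=0.0)
best_expert_only = max(range(len(expert_only)), key=expert_only.__getitem__)

print(best, best_expert_only)
```

In this toy case the expert alone prefers token 0, while subtracting the amateur's log-probabilities shifts the choice to token 1, which the expert likes relatively more than the amateur does.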
@XiangLi1999 Thanks for your great work!
I am trying to reproduce the results on wikitext but have run into some problems.
I use your script:
And then evaluate the output file by:
The output is
which differs from the results reported in the paper (coherence = 0.59 vs. 0.69).
I find that
./outputs_ignorePrefix_ccnews_256/wikitext_results/wikitext_gpt2-1.0-t0.5_gpt2-xl_256.jsonl
can produce the correct metric values. May I ask two questions:
What does the file name wikitext_gpt2-1.0-t0.5_gpt2-xl_256.jsonl mean? For example, 256 seems to be the output length and 0.5 the student temperature. What does 1.0 indicate?
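One possible reading of the file name, pieced together from the command and the reply in this thread, is prompt file, student model, st_coef, student temperature, main model, and length. The sketch below parses the name under that assumption; the pattern and the field names are guesses, not something the repository documents.

```python
import re

# Assumed layout (a guess based on this thread, not a documented convention):
#   {prompt}_{student}-{st_coef}-t{student_temperature}_{model}_{length}.jsonl
PATTERN = re.compile(
    r"^(?P<prompt>[^_]+)_"
    r"(?P<student>.+?)-(?P<st_coef>\d+(?:\.\d+)?)"
    r"-t(?P<student_temperature>\d+(?:\.\d+)?)_"
    r"(?P<model>.+)_(?P<length>\d+)\.jsonl$"
)

def parse_output_name(filename):
    """Split an output file name into its assumed fields, or return None."""
    match = PATTERN.match(filename)
    if match is None:
        return None
    fields = match.groupdict()
    fields["st_coef"] = float(fields["st_coef"])
    fields["student_temperature"] = float(fields["student_temperature"])
    fields["length"] = int(fields["length"])
    return fields

parsed = parse_output_name("wikitext_gpt2-1.0-t0.5_gpt2-xl_256.jsonl")
print(parsed)
```

Under this reading the fields line up with the flags --prompt_file, --student_name_or_path, --st_coef, --student_temperature, --model_name_or_path, and --length from the command in this thread.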