Open chenyangh opened 1 month ago
Hi, thanks for reaching out! Sorry for the delay.
The default max length is 512, and we analyze the impact of different max lengths in Figure 4 in the Appendix. Note that for Table 2, our top-p is always 1, i.e., unbiased sampling. For Table 3, the numbers correspond to the best p reported in the table.
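For anyone replicating these settings, here is a minimal sketch of why top-p = 1 amounts to unbiased sampling. It assumes the standard nucleus (top-p) filtering rule; `top_p_filter` is a hypothetical helper written for illustration, not a function from this repository.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of most-likely tokens whose total mass reaches p,
    then renormalize. Hypothetical helper for illustration only."""
    # Token indices sorted from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    # Renormalize the surviving tokens; everything else gets probability 0.
    total = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / total
    return filtered


probs = [0.2, 0.5, 0.3]
full = top_p_filter(probs, 1.0)       # keeps every token: distribution unchanged
truncated = top_p_filter(probs, 0.6)  # drops the least likely token
```

With p = 1 no token is dropped, so sampling from the filtered distribution is the same as sampling from the model's own distribution. Any p < 1 removes the tail and biases samples toward high-probability tokens.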
As shown in Table 8 and explained in Appendix D.4, for WikiText we can only compute c-mauve_100 because we could not get 10K 200-token fragments from WikiText.
Hope this helps!
Hello,
I have a few questions while replicating the numbers using the provided checkpoints.
Thanks for making the code public.