karpathy / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Perplexity #548

Open Precola opened 3 months ago

Precola commented 3 months ago

When I want to compute the perplexity of GPT-2, how many epochs of training are suitable for the model?

When the model is trained for 400 epochs, the perplexity is about 35. Will the perplexity keep going down with more epochs?
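
For reference, perplexity is usually computed as the exponential of the mean token-level cross-entropy loss on the validation split. Below is a minimal sketch in the style of nanoGPT's `estimate_loss()`; the `get_batch` helper, the `eval_iters` value, and the assumption that the model's forward pass returns `(logits, loss)` are taken from the nanoGPT training script and are not part of this issue.

```python
import math
import torch

@torch.no_grad()
def estimate_perplexity(model, get_batch, eval_iters=200, split='val'):
    # Average the token-level cross-entropy loss over several eval batches,
    # then exponentiate: perplexity = exp(mean cross-entropy in nats).
    model.eval()
    losses = torch.zeros(eval_iters)
    for k in range(eval_iters):
        X, Y = get_batch(split)      # (B, T) input token ids and targets
        _, loss = model(X, Y)        # nanoGPT forward returns (logits, loss)
        losses[k] = loss.item()
    model.train()
    return math.exp(losses.mean().item())
```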