ai-forever / ru-gpts

Russian GPT3 models.
Apache License 2.0

Training-test data contamination in the essay example #23

Closed mgrankin closed 3 years ago

mgrankin commented 3 years ago

I was excited to see the outstanding perplexities (8 and 3) reported for essay generation. The results are so good that I decided to check for data leakage. Unfortunately, your training set contains all of the validation set data. The resulting model could be overfit if you trained for more than one epoch (it's impossible to tell without a proper validation set).
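The kind of leak described above is easy to check for mechanically. A minimal sketch, assuming the datasets are plain-text files with one example per line (the actual ru-gpts dataset layout may differ):

```python
def contamination_ratio(train_lines, valid_lines):
    """Return the fraction of validation examples that also occur verbatim
    in the training data. 1.0 means the validation set is fully contained
    in the training set, as reported in this issue."""
    train_set = set(line.strip() for line in train_lines if line.strip())
    valid = [line.strip() for line in valid_lines if line.strip()]
    if not valid:
        return 0.0
    leaked = sum(1 for line in valid if line in train_set)
    return leaked / len(valid)
```

If the ratio is close to 1.0, validation perplexity is measuring memorization rather than generalization, so low numbers like 8 and 3 are not meaningful. (Exact line matching is a simplification; fuzzy or substring overlap can also leak.)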

king-menin commented 3 years ago

You are right. It was only an example with a very small dataset. In our other experiments, perplexity is around 13-18.