Could you elaborate on why validation doesn't really help in finding overfitting?

minimaxir / aitextgen

A robust Python tool for text-based AI training and generation using GPT-2.

https://docs.aitextgen.io

MIT License

1.84k stars 220 forks source link

Could you elaborate on why validation doesn't really help in finding overfitting? #153

Closed redthing1 closed 3 years ago

redthing1 commented 3 years ago

I was curious about this in your features page:

Validation is deliberately not implemented since it serves as more of a crutch and doesn't provide that much help in identifying overfitting

Could you explain a bit? I was under the impression that validation was a useful metric for training causal language models.

minimaxir commented 3 years ago

It's good for objective metrics (which is what the language modeling research papers use), but in the end for text generation, there's a lot of subjectivity (especially with temperatures) which a validation pass don't help as much to identify.