graykode / gpt-2-Pytorch

Simple Text-Generator with OpenAI gpt-2 Pytorch Implementation
MIT License
971 stars, 227 forks

Discrepancy in Parameter Size of Smallest Model #25

Open mmdalix opened 1 year ago

mmdalix commented 1 year ago

I have been using the GPT-2 implementation from your repository and noticed that the smallest model available here differs in size from the smallest model described in the original GPT-2 paper. Specifically, the smallest model in the repository has about 124M parameters, whereas the smallest model in the paper is listed as 117M.

I am curious to know why there is this difference.
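For reference, here is a rough tally of where the 124M figure comes from. It is a minimal sketch, not code from this repository: the function name `gpt2_param_count` is hypothetical, and it assumes the usual hyperparameters for the smallest GPT-2 (vocabulary 50257, context 1024, width 768, 12 layers), a 4x MLP expansion, and an output projection that shares weights with the token embedding.

```python
def gpt2_param_count(n_vocab=50257, n_ctx=1024, n_embd=768, n_layer=12):
    """Count trainable parameters for a GPT-2 style decoder (hypothetical helper).

    Assumes the output head is tied to the token embedding, so it adds
    no parameters of its own. Non-trainable buffers (e.g. the causal
    mask) are not counted.
    """
    wte = n_vocab * n_embd          # token embedding
    wpe = n_ctx * n_embd            # learned position embedding
    ln = 2 * n_embd                 # LayerNorm weight + bias

    # Attention: fused QKV projection plus output projection, each with bias.
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)

    # MLP: expand to 4 * n_embd, then project back, each with bias.
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)

    block = ln + attn + ln + mlp    # one transformer block
    return wte + wpe + n_layer * block + ln  # trailing ln is the final LayerNorm


total = gpt2_param_count()
print(f"{total:,}")  # 124,439,808
```

Under these assumptions the count comes to roughly 124M, which matches what I see in the repository, so the model itself looks consistent; the open question is how the 117M figure in the paper was obtained.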