ai-forever / ru-gpts

Russian GPT3 models.
Apache License 2.0
2.08k stars, 445 forks

What does gpt3 based on gpt2 mean? #81

Closed (gaardhus closed this issue 2 years ago)

gaardhus commented 2 years ago

I'm having a hard time finding out what it specifically means that the models are gpt3 models based on gpt2. Does that mean that they are gpt2 models finetuned for Russian? Do they have the same number of parameters as gpt3? Etc.

Any clarification would be greatly appreciated, and sorry if I have overlooked something.

TatianaShavrina commented 2 years ago

@gaardhus Sure! The "gpt2" in the model names is a hack for compatibility with the huggingface library: its loading function searches for a "gpt2" substring in the model name in order to import the classes compatible with gpt-like models.

Also, we used the Megatron-LM GPT2 code to recreate GPT3 according to the GPT3 paper.
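To illustrate the name-matching hack described above, here is a minimal sketch of substring-based class lookup. It is not the huggingface library's actual code: the registry `PATTERN_TO_FAMILY`, the function `resolve_family`, and the family labels are all hypothetical names made up for this example. Only the idea (look for "gpt2" inside the model name) comes from the explanation above.

```python
# Hypothetical registry mapping a substring pattern to a model family label.
# These entries are illustrative assumptions, not the library's real mapping.
PATTERN_TO_FAMILY = {
    "gpt2": "GPT2-style classes",
    "bert": "BERT-style classes",
}


def resolve_family(model_name: str) -> str:
    """Return the family of the first pattern that occurs in model_name."""
    for pattern, family in PATTERN_TO_FAMILY.items():
        if pattern in model_name:
            return family
    raise ValueError(f"No known pattern found in model name: {model_name!r}")


# A name that contains "gpt2" resolves to the GPT2-style classes,
# even though the model itself follows the GPT3 architecture.
print(resolve_family("rugpt3large_based_on_gpt2"))

# Without the "_based_on_gpt2" suffix, the lookup finds nothing.
try:
    resolve_family("rugpt3large")
except ValueError as err:
    print(err)
```

Under this kind of lookup, the "_based_on_gpt2" suffix exists only so that the name matches a known pattern. It says nothing about the weights being finetuned from GPT2.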

gaardhus commented 2 years ago

Awesome! Thanks for the clarification :)