Closed by gaardhus 2 years ago
@gaardhus Sure! This is a hack for huggingface lib compatibility: the library searches for the substring "gpt2" in a model's name to select the compatible classes for GPT-like models.
Also, we used the Megatron-LM GPT-2 code to recreate GPT-3 according to their paper.
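A rough sketch of the naming hack described above (this is assumed logic for illustration, not the actual transformers source; the function name `pick_model_class` is hypothetical):

```python
# Sketch: the library decides which model classes to load by checking
# whether the checkpoint name contains the substring "gpt2".
def pick_model_class(model_name: str) -> str:
    if "gpt2" in model_name.lower():
        return "GPT2LMHeadModel"  # GPT-2-compatible architecture
    return "AutoModel"            # fall back to generic resolution

# A GPT-3-style checkpoint keeps "gpt2" in its name so the GPT-2
# classes are selected, e.g.:
print(pick_model_class("rugpt3small_based_on_gpt2"))  # GPT2LMHeadModel
```

So the "gpt2" in the model name is purely a compatibility signal for class selection, not a statement about the training setup.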
Awesome! Thanks for the clarification :)
I'm having a hard time figuring out what it specifically means that the models are GPT-3 models based on GPT-2. Does that mean they are GPT-2 models finetuned for Russian? Do they have the same number of parameters as GPT-3? Etc.
Any clarification would be greatly appreciated, and sorry if I have overlooked something.