Link to Grossmend's model

Kirili4ik / ruDialoGpt3-finetune-colab

Ready-to-use Colab tutorial for fine-tuning the ruDialoGpt3 model on a Telegram chat using HuggingFace and PyTorch

https://colab.research.google.com/drive/1fnAVURjyZRK9VQg1Co_-SKUQnRES8l9R?usp=sharing

MIT License


Link to Grossmend's model #1

Open nikich340 opened 1 year ago

nikich340 commented 1 year ago

Hello. It seems that Grossmend's original model was deleted: https://huggingface.co/Grossmend/rudialogpt3_medium_based_on_gpt2 Do you still have that model, by any chance?

Kirili4ik commented 1 year ago

Oh, that's unfortunate. If you are looking for a Russian-language model, you can try these:
https://huggingface.co/DeepPavlov/rudialogpt3_medium_based_on_gpt2_v2
https://huggingface.co/tinkoff-ai/ruDialoGPT-medium
Or search here: https://huggingface.co/models

I don't have a copy of Grossmend's model on me =(

nikich340 commented 1 year ago

I see, thank you! Would you recommend using a rugpt-3 model already trained on dialog data (one of those you mentioned, for example), or fine-tuning from the "vanilla" Sber rugpt3-medium/small, given that my dataset is quite small (1500 "context - answer" pairs)?

Kirili4ik commented 1 year ago

If you have context-answer pairs, you should look into training text-to-text models such as T5/mT5/flan-T5/ru-T5 instead of pure generators like GPT-2/3. But generally it's better to go with the model that was trained on the data closest to yours (provided the model is trained well and not overfitted).

Sorry for such a slow reply
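
For readers following the text-to-text suggestion above, here is a minimal sketch of how "context - answer" pairs might be turned into source/target examples with a small validation split. It uses only the Python standard library and does not train anything. The function name `build_seq2seq_examples`, the `"dialog: "` prefix, the 10% validation fraction, and the sample pairs are all assumptions made for illustration; they do not come from the thread or from any particular model's documentation.

```python
import random
from typing import Dict, List, Sequence, Tuple


def build_seq2seq_examples(
    pairs: Sequence[Tuple[str, str]],
    prefix: str = "dialog: ",      # hypothetical task prefix, not a required value
    val_fraction: float = 0.1,     # assumed split size; tune for your dataset
    seed: int = 0,
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Convert (context, answer) pairs into source/target dicts and split them."""
    # Drop pairs where either side is empty after stripping whitespace.
    examples = [
        {"source": prefix + context.strip(), "target": answer.strip()}
        for context, answer in pairs
        if context.strip() and answer.strip()
    ]
    # Shuffle deterministically so the split is reproducible.
    rng = random.Random(seed)
    rng.shuffle(examples)
    n_val = int(round(len(examples) * val_fraction))
    return examples[n_val:], examples[:n_val]


# Tiny made-up sample; a real dataset would hold all of your pairs.
sample_pairs = [
    ("Привет, как дела?", "Отлично, а у тебя?"),
    ("Что делаешь?", "Читаю книгу."),
    ("   ", "This pair is dropped because the context is empty."),
]
train_examples, val_examples = build_seq2seq_examples(sample_pairs, val_fraction=0.5)
```

The resulting `source` strings would be tokenized as encoder inputs and the `target` strings as labels when fine-tuning an encoder-decoder model. With a dataset as small as 1500 pairs, holding out a validation split like this is one way to watch for the overfitting mentioned above.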