Closed: shamanez closed this issue 3 years ago
The model card provides the necessary hyperparameters for training; see #3940 for an example invocation
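For illustration, a fine-tuning invocation might look roughly like the sketch below. This is not the exact command from the card or #3940: the hyperparameter values, the dict file path, and the blended_skill_talk task (standing in for your own data) are all assumptions.

```bash
# Sketch only: fine-tune the public BlenderBot 2.0 400M model on a ParlAI task.
# Hyperparameter values are placeholders; take the real ones from the model card.
parlai train_model \
  -m projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent \
  --init-model zoo:blenderbot2/blenderbot2_400M/model \
  --dict-file zoo:blenderbot2/blenderbot2_400M/model.dict \
  -t blended_skill_talk \
  --batchsize 16 --fp16 true -lr 1e-5 \
  --model-file /tmp/bb2_finetuned/model
# BlenderBot 2.0 may also need its search/memory flags set; see the model card.
```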
This issue has not had activity in 30 days. Please feel free to reopen if you have more issues. You may apply the "never-stale" tag to prevent this from happening.
@klshuster Hello! It seems the model card you mentioned doesn't provide information about all of the datasets used for training. There appear to be 7 datasets according to the "multitask_weights" parameter, but only 4 are mentioned.
The BST tasks are listed as one, but are actually 4 tasks: ConvAI2, Empathetic Dialogues, Wizard of Wikipedia, and Blended Skill Talk. Add in WoI (Wizard of the Internet), MSC (Multi-Session Chat), and BAD (Bot Adversarial Dialogue) and you get 7.
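In ParlAI terms, that would correspond to a 7-entry task list with one weight per task, roughly as sketched below; the task names follow ParlAI conventions, but the weights shown are hypothetical, since the real ones aren't spelled out here.

```bash
# Illustrative only: the 7 tasks as a ParlAI multitask spec.
# One weight per task in --multitask-weights; these values are hypothetical.
parlai train_model \
  -m projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent \
  --init-model zoo:blenderbot2/blenderbot2_400M/model \
  -t convai2,empathetic_dialogues,wizard_of_wikipedia,blended_skill_talk,wizard_of_internet,msc,bot_adversarial_dialogue \
  --multitask-weights 3,3,3,3,1,1,1 \
  --model-file /tmp/bb2_multitask/model
```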
@stephenroller Hello! Thank you for the help. Why wasn't the task data saved in the model's .opt file, as was done for BlenderBot 1?
The model was trained on some tasks before they were released publicly (WoI, MSC), and several flags/names changed in the open-sourcing process, so we did not have a chance to convert all of the .opt parameters.
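The parameters that did survive can still be inspected directly: ParlAI writes the .opt file as JSON next to the model weights. A minimal check, assuming the default download location of the 400M zoo model:

```bash
# Print the recorded task string and multitask weights from the model's .opt.
# The path assumes ParlAI's default data directory; adjust to your setup.
python -c "
import json, os
path = os.path.expanduser('~/ParlAI/data/models/blenderbot2/blenderbot2_400M/model.opt')
opt = json.load(open(path))
print(opt.get('task'))
print(opt.get('multitask_weights'))
"
```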
It depends on which base model you want to use for your use case. I think using a model that already has truncation set to 1024 is your best bet.
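If you go that route, the truncation can also be pinned explicitly when fine-tuning; a sketch building on the invocation above (the label truncation value is an assumption):

```bash
# As the earlier sketch, with text truncation pinned to 1024.
parlai train_model \
  -m projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent \
  --init-model zoo:blenderbot2/blenderbot2_400M/model \
  -t blended_skill_talk \
  --text-truncate 1024 --label-truncate 128 \
  --model-file /tmp/bb2_finetuned/model
```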
Is your feature request related to a problem? Please describe.
I would like to fine-tune BlenderBot 2.0 with domain-specific data. For BlenderBot 1.0, there is a script that demonstrates the fine-tuning procedure.

Describe the solution you'd like
A script that shows the fine-tuning procedure.