Closed: shamanez closed this issue 3 years ago
The model card provides the necessary hyperparameters for training; see #3940 for an example invocation
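For illustration, a fine-tuning invocation might look roughly like the sketch below. This is not the exact command from the card or #3940: the hyperparameter values, the dict file path, and the blended_skill_talk task (standing in for your own data) are all assumptions.

```bash
# Sketch only: fine-tune the public BlenderBot 2.0 400M model on a ParlAI task.
# Hyperparameter values are placeholders; take the real ones from the model card.
parlai train_model \
  -m projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent \
  --init-model zoo:blenderbot2/blenderbot2_400M/model \
  --dict-file zoo:blenderbot2/blenderbot2_400M/model.dict \
  -t blended_skill_talk \
  --batchsize 16 --fp16 true -lr 1e-5 \
  --model-file /tmp/bb2_finetuned/model
# BlenderBot 2.0 may also need its search/memory flags set; see the model card.
```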
This issue has not had activity in 30 days. Please feel free to reopen if you have more issues. You may apply the "never-stale" tag to prevent this from happening.
@klshuster Hello! It seems the model card you mentioned doesn't provide information about all of the datasets used for training. There appear to be 7 datasets according to the "multitask_weights" parameter, but only 4 are mentioned.
The BST tasks are listed as one, but are actually 4 tasks: ConvAI2, Empathetic Dialogues, Wizard of Wikipedia, and Blended Skill Talk. Add in WoI (Wizard of the Internet), MSC (Multi-Session Chat), and BAD (Bot Adversarial Dialogue) and you get 7.
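In ParlAI terms, that would correspond to a 7-entry task list with one weight per task, roughly as sketched below; the task names follow ParlAI conventions, but the weights shown are hypothetical, since the real ones aren't spelled out here.

```bash
# Illustrative only: the 7 tasks as a ParlAI multitask spec.
# One weight per task in --multitask-weights; these values are hypothetical.
parlai train_model \
  -m projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent \
  --init-model zoo:blenderbot2/blenderbot2_400M/model \
  -t convai2,empathetic_dialogues,wizard_of_wikipedia,blended_skill_talk,wizard_of_internet,msc,bot_adversarial_dialogue \
  --multitask-weights 3,3,3,3,1,1,1 \
  --model-file /tmp/bb2_multitask/model
```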
@stephenroller Hello! Thank you for the help. Why wasn't the task data saved in the model's .opt file, as was done for BlenderBot 1?
The model was trained on some tasks before they were released publicly (WoI, MSC), and several flags/names changed in the open-sourcing process, so we did not have a chance to convert all of the .opt parameters.
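The parameters that did survive can still be inspected directly: ParlAI writes the .opt file as JSON next to the model weights. A minimal check, assuming the default download location of the 400M zoo model:

```bash
# Print the recorded task string and multitask weights from the model's .opt.
# The path assumes ParlAI's default data directory; adjust to your setup.
python -c "
import json, os
path = os.path.expanduser('~/ParlAI/data/models/blenderbot2/blenderbot2_400M/model.opt')
opt = json.load(open(path))
print(opt.get('task'))
print(opt.get('multitask_weights'))
"
```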
It depends on which base model you want to use for your use case. I think using a model that already has truncation set to 1024 is your best bet.
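If you go that route, the truncation can also be pinned explicitly when fine-tuning; a sketch building on the invocation above (the label truncation value is an assumption):

```bash
# As the earlier sketch, with text truncation pinned to 1024.
parlai train_model \
  -m projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent \
  --init-model zoo:blenderbot2/blenderbot2_400M/model \
  -t blended_skill_talk \
  --text-truncate 1024 --label-truncate 128 \
  --model-file /tmp/bb2_finetuned/model
```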
Is your feature request related to a problem? Please describe.
I would like to fine-tune BlenderBot 2.0 with domain-specific data. For BlenderBot 1.0, there is a script that demonstrates the fine-tuning procedure.

Describe the solution you'd like
A script that shows the fine-tuning procedure.