facebookresearch / ParlAI

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
https://parl.ai
MIT License

Can the metaseq 125M to 66B OPT models be used as BB3 models, just like the 175B model? #4988

Closed · MrD005 closed this issue 1 year ago

MrD005 commented 1 year ago

I am trying to replace the BB3 2.7B model with the OPT 6.7B model using metaseq, but it is not working for me. Has anyone tried something like that, or another way of achieving it apart from metaseq? I am currently trying the alpa GitHub repo to test, but any suggestions would be helpful.

klshuster commented 1 year ago

could you elaborate on what you're trying to accomplish?

bb3 2.7B is not an OPT-based model. bb3 30b and bb3 175b are fine-tuned variants of OPT models

MrD005 commented 1 year ago

I am using the BB3 3B model and now want to move to the BB3 30B model through the BB3 OPT API with metaseq, but there is a big hardware-requirement gap between the 3B and 30B models. So I am trying to serve OPT 6.7B or 13B through metaseq in the same way that the 30B and 175B models are served.

MrD005 commented 1 year ago

Is there any way I can fine-tune OPT 6.7B into a BB3 6.7B model, similar to how the 30B model is a fine-tuned variant of OPT?

klshuster commented 1 year ago

I would take a look at the metaseq training code. You would need to convert the BB3 training tasks to a metaseq format and train a model there
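
For what it's worth, a rough sketch of the first step (exporting a ParlAI dialogue task to JSONL that a metaseq-style fine-tuning pipeline could read) might look like the snippet below. It uses ParlAI's standard RepeatLabelAgent iteration loop; the task name (`blended_skill_talk`) is just a stand-in for the actual BB3 training tasks, and the `{"text": ...}` JSONL schema is an assumption, so check the metaseq fine-tuning docs for the exact format it expects.

```python
# Hedged sketch: dump a ParlAI task to flattened-dialogue JSONL for metaseq fine-tuning.
# The output schema ({"text": ...}) and the example task name are assumptions, not the
# official BB3 -> metaseq conversion.
import json

from parlai.core.params import ParlaiParser
from parlai.core.worlds import create_task
from parlai.agents.repeat_label.repeat_label import RepeatLabelAgent


def dump_task_to_jsonl(task: str, outfile: str, datatype: str = 'train') -> None:
    parser = ParlaiParser()
    opt = parser.parse_args(['--task', task, '--datatype', datatype])
    agent = RepeatLabelAgent(opt)
    world = create_task(opt, agent)

    with open(outfile, 'w') as fout:
        turns = []
        while not world.epoch_done():
            world.parley()
            teacher_act = world.get_acts()[0]
            # collect the context turn and its label for this parley
            turns.append(teacher_act.get('text', ''))
            labels = teacher_act.get('labels') or teacher_act.get('eval_labels') or ['']
            turns.append(labels[0])
            if teacher_act.get('episode_done', False):
                # write one flattened episode per line (assumed metaseq-style schema)
                fout.write(json.dumps({'text': '\n'.join(turns)}) + '\n')
                turns = []


if __name__ == '__main__':
    # hypothetical example task; swap in the actual BB3 training tasks
    dump_task_to_jsonl('blended_skill_talk', 'bb3_train.jsonl')
```

From there the actual fine-tuning configuration (sharding, tokenizer, hyperparameters) would follow the metaseq training documentation rather than anything in ParlAI.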

github-actions[bot] commented 1 year ago

This issue has not had activity in 30 days. Please feel free to reopen if you have more issues. You may apply the "never-stale" tag to prevent this from happening.