facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Questions about bart.large.mnli and fine tuning bart #5506

Open sailerco opened 3 months ago

sailerco commented 3 months ago

❓ Questions and Help

I'm researching zero-shot classification and I'm using facebook/bart-large-mnli, which is based on your bart.large.mnli model. I haven't been able to work out how that model was originally fine-tuned from the base BART checkpoint. As far as I'm aware, there are three common types of fine-tuning (see image).

[image: diagram of the three common fine-tuning strategies]
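
For context, this is roughly how I'm using the model right now, a minimal zero-shot sketch with the Hugging Face transformers pipeline (the example text and candidate labels are just placeholders):

```python
# Minimal sketch of my current zero-shot setup with the
# Hugging Face transformers pipeline and facebook/bart-large-mnli.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="facebook/bart-large-mnli",
)

# Placeholder input text and candidate labels.
result = classifier(
    "The new GPU drivers cut our training time in half.",
    candidate_labels=["technology", "sports", "politics"],
)
print(result["labels"], result["scores"])  # labels sorted by entailment score
```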

This raised a few questions:

  1. What type of fine-tuning was originally used for the bart.large.mnli model? Were all layers updated, or just the classification head?
  2. Are there any other resources about the MNLI model beyond the short description in the paper? Is there a paper about its origin/development?
  3. Are there tutorials that show how to correctly fine-tune bart.large.mnli (or, I guess, the Hugging Face model) on custom NLI datasets? A rough sketch of what I'm imagining is below.
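
For question 3, here is a minimal sketch of what I imagine full fine-tuning (all layers) would look like with the transformers Trainer. The CSV path and the premise/hypothesis/label column names are placeholders for a custom dataset, and I'm assuming integer labels that follow the model's mapping (0 = contradiction, 1 = neutral, 2 = entailment):

```python
# Rough sketch of full fine-tuning on a custom NLI dataset with the
# transformers Trainer. File name and column names are placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)

model_name = "facebook/bart-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Hypothetical custom NLI data with "premise", "hypothesis", and an
# integer "label" column (0=contradiction, 1=neutral, 2=entailment).
dataset = load_dataset("csv", data_files={"train": "my_nli_train.csv"})

def tokenize(batch):
    # NLI inputs are encoded as premise/hypothesis sentence pairs.
    return tokenizer(batch["premise"], batch["hypothesis"], truncation=True)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="bart-mnli-custom",
    per_device_train_batch_size=8,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorWithPadding(tokenizer),  # pad per batch
)
trainer.train()  # updates all layers unless parameters are frozen first
```

If only the classification head should be trained, I assume one would first freeze the base model (e.g. set requires_grad = False on model.model.parameters()) before calling trainer.train(), but that is exactly the kind of detail I'd like to confirm.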