allenai / unifew

Unifew: Unified Fewshot Learning Model
Apache License 2.0

How to use UniFew with BERT pretrained model? #2

Open LiweiPeng opened 2 years ago

LiweiPeng commented 2 years ago

Hi,

The results of UniFew on few-shot tasks are very impressive. The current UniFew uses a UnifiedQA pretrained model (T5- or BART-based). Because our pretrained models are BERT-based, I have two questions: 1) Do you plan to support UniFew with BERT (or RoBERTa) pretrained models? 2) If not, what would be required to make UniFew work with BERT-like pretrained models? Thanks, Liwei

armancohan commented 2 years ago

Hi, thanks for your interest. UniFew would not work with RoBERTa or BERT-based models. In UniFew we format downstream tasks as multiple-choice QA, so it is natural to use UnifiedQA, which is pretrained on question-answering tasks; switching to BERT/RoBERTa, which are pretrained with masked language modeling (MLM), would not work. As explained in the paper, we used UnifiedQA to minimize prompt engineering and avoid the more complex tricks prior work needs to adapt MLM-based models to the few-shot setting.
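For concreteness, here is a minimal, hypothetical sketch of casting a classification example as multiple-choice QA for a UnifiedQA-style seq2seq model. The checkpoint name, prompt wording, and separator convention below are illustrative assumptions, not necessarily the exact templates UniFew uses internally.

```python
# Sketch: turn a sentiment-classification example into a multiple-choice QA
# prompt and run it through a UnifiedQA checkpoint from the Hugging Face hub.
# Assumptions: the "allenai/unifiedqa-t5-small" checkpoint and the
# lowercased "question \n (a) ... (b) ..." input convention from the
# UnifiedQA model card; UniFew's actual templates may differ.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "allenai/unifiedqa-t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def format_as_mcqa(question, choices):
    # Append lettered answer options after the question, lowercased,
    # following the UnifiedQA-style input format (assumed here).
    letters = "abcdefgh"
    options = " ".join(f"({letters[i]}) {c}" for i, c in enumerate(choices))
    return f"{question} \\n {options}".lower()

# Classification label set rephrased as answer choices.
review = "The movie was a delightful surprise from start to finish."
prompt = format_as_mcqa(
    f"What is the sentiment of the following review? {review}",
    ["positive", "negative"],
)

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=8)
prediction = tokenizer.decode(output_ids[0], skip_special_tokens=True)
# The generated string is then matched against the answer choices to pick
# the predicted label.
print(prediction)
```

Because the model generates an answer string rather than scoring a fixed label set with an MLM head, this setup depends on a generative QA-pretrained backbone like UnifiedQA; an encoder-only BERT/RoBERTa model has no natural way to produce the answer text.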
Curious, what is the main reason you would want to switch to BERT-like models?

LiweiPeng commented 2 years ago

@armancohan Really appreciate your quick response. All of our current downstream tasks are using a BERT model which we pretrained from scratch using in-house domain data. Switching to other models like UnifiedQA means we may need to pretrain our own UnifiedQA model, which may take significant efforts and GPU resources. This is the main reason that I'd like to stick to BERT if possible. As you said, UniFew won't work with BERT-based models. I'll take a deep look at UnifiedQA on what we can do from there. Thanks.