duggurd / graduation_project


Test a smaller pre-trained transformer #23

Closed duggurd closed 1 year ago

duggurd commented 1 year ago

Try a transformer model smaller than the current 66 million parameters. A pre-trained model is probably the better choice.

BERT-tiny

About 4 million parameters (BERT-tiny)

BERT-mini

About 11 million parameters (BERT-mini)
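As a rough sanity check on these sizes, the parameter count of a BERT encoder can be estimated from its hyperparameters. The sketch below is not from the project: it assumes the published miniature-BERT shapes (BERT-tiny with 2 layers and hidden size 128, BERT-mini with 4 layers and hidden size 256, feed-forward size 4x the hidden size) and the standard 30522-token vocabulary. The function name `bert_param_count` is hypothetical.

```python
def bert_param_count(hidden_size, num_hidden_layers, intermediate_size,
                     vocab_size=30522, max_position_embeddings=512,
                     type_vocab_size=2):
    """Estimate trainable parameters of a BERT encoder with a pooler.

    Argument names mirror the fields of BertConfig. The number of
    attention heads does not change the count, so it is omitted.
    """
    h, i = hidden_size, intermediate_size
    # Word, position and token-type embeddings, plus LayerNorm weight and bias.
    embeddings = (vocab_size + max_position_embeddings + type_vocab_size) * h + 2 * h
    # Query, key, value and output projections (weights and biases), plus LayerNorm.
    attention = 4 * (h * h + h) + 2 * h
    # Two dense layers of the feed-forward block, plus LayerNorm.
    feed_forward = (h * i + i) + (i * h + h) + 2 * h
    # Dense pooler layer on top of the first token.
    pooler = h * h + h
    return embeddings + num_hidden_layers * (attention + feed_forward) + pooler


# Assumed shapes for the two candidate models (not stated in this issue).
tiny = bert_param_count(hidden_size=128, num_hidden_layers=2, intermediate_size=512)
mini = bert_param_count(hidden_size=256, num_hidden_layers=4, intermediate_size=1024)
print(f"BERT-tiny: {tiny:,}")  # 4,385,920
print(f"BERT-mini: {mini:,}")  # 11,170,560
```

Most of the parameters in both models sit in the word embedding table, so shrinking the vocabulary would reduce the size further than removing a layer would.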

From scratch

Create our own smaller model with BertConfig/DistilBertConfig and train it from scratch.
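A minimal sketch of the from-scratch option, assuming the Hugging Face `transformers` library with PyTorch installed. The hyperparameter values are illustrative assumptions (they match the BERT-tiny shape), not values chosen in this issue, and the bare `BertModel` stands in for whatever task head the project actually needs.

```python
from transformers import BertConfig, BertModel

# Illustrative hyperparameters (assumed, roughly the BERT-tiny shape).
config = BertConfig(
    vocab_size=30522,             # must match the tokenizer in use
    hidden_size=128,
    num_hidden_layers=2,
    num_attention_heads=2,        # hidden_size must be divisible by this
    intermediate_size=512,
    max_position_embeddings=512,
)

# Building the model from a config gives randomly initialised weights,
# so it has to be trained from scratch.
model = BertModel(config)

n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")
```

`DistilBertConfig` works the same way but uses different field names (`dim`, `n_layers`, `n_heads`, `hidden_dim`), so the arguments above cannot be passed to it unchanged.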