allenai / scibert

A BERT model for scientific text.
https://arxiv.org/abs/1903.10676
Apache License 2.0

How to apply sciBERT on binary classification in PyTorch #80

Open JiayuanDing100 opened 4 years ago

JiayuanDing100 commented 4 years ago

Hi, thanks for your awesome work on this domain-specific BERT model. I just tried pre-trained BERT in PyTorch for binary classification using this link: https://medium.com/swlh/a-simple-guide-on-using-bert-for-text-classification-bbf041ac8d04

Is there a simple tutorial for SciBERT on binary classification in PyTorch? Thanks

amandalmia14 commented 3 years ago

@JiayuanDing100 were you able to make any progress on this?

ibeltagy commented 3 years ago

You would follow the same recipe, except replacing `bert-base-cased` with one of the SciBERT models, e.g. `allenai/scibert_scivocab_uncased`.

Side note: you might find better trainers in the HF examples at https://github.com/huggingface/transformers/tree/master/examples/text-classification. The blog post is a little outdated.
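To make the recipe above concrete, here is a minimal sketch of a forward/backward pass for binary classification with the SciBERT checkpoint named above. The example sentences and labels are placeholders, and the snippet shows a single step rather than a full training loop:

```python
# Sketch: binary classification with SciBERT via Hugging Face transformers.
# The two example texts and their labels are made up for illustration.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "allenai/scibert_scivocab_uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# num_labels=2 attaches a fresh 2-way classification head on top of the encoder
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

texts = ["The enzyme catalyzes hydrolysis of the substrate.",
         "We observed no significant effect."]
labels = torch.tensor([1, 0])

enc = tokenizer(texts, padding=True, truncation=True, max_length=128,
                return_tensors="pt")
out = model(**enc, labels=labels)   # passing labels makes the model return a loss
out.loss.backward()                 # one backward pass; pair with an optimizer step
print(out.logits.shape)             # one pair of logits per sentence
```

In a real run you would wrap this in a training loop with an optimizer (e.g. `torch.optim.AdamW`), or use the HF `Trainer` from the examples linked above.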

amandalmia14 commented 3 years ago

Thanks a lot @ibeltagy

jaihonikhil commented 3 years ago

Can you please explain how I can replace BERT with SciBERT for BERT_FOR_SEQUENCE_CLASSIFICATION?

adeepH commented 3 years ago

@jaihonikhil You could use `AutoModel` and load SciBERT with `AutoModel.from_pretrained('allenai/scibert_scivocab_uncased')`. Then add a linear layer for sequence classification, just like with any other BERT model.
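A minimal sketch of that suggestion: an `AutoModel` encoder wrapped with a linear head over the `[CLS]` token. The class name, dropout rate, and the choice to pass the encoder in as a constructor argument are illustrative, not from the thread:

```python
# Sketch: SciBERT encoder + linear classification head (assumed design).
import torch
import torch.nn as nn


class SciBertClassifier(nn.Module):
    """Wraps a pretrained transformer encoder with a linear head."""

    def __init__(self, encoder, num_labels=2):
        super().__init__()
        self.encoder = encoder
        self.dropout = nn.Dropout(0.1)  # illustrative dropout rate
        self.classifier = nn.Linear(encoder.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask=None):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] token representation
        return self.classifier(self.dropout(cls))


# Usage (downloads the SciBERT weights):
# from transformers import AutoModel
# model = SciBertClassifier(
#     AutoModel.from_pretrained("allenai/scibert_scivocab_uncased"))
```

Train it with a standard cross-entropy loss (`nn.CrossEntropyLoss`) on the returned logits.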