huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Correct way to use pre-trained models - Any document on this? #13238

Closed pratikchhapolika closed 3 years ago

pratikchhapolika commented 3 years ago

I want to solve a multi-class, multi-label (MLMC) classification problem using the ConvBERT model.

The steps I have taken are:

I downloaded the ConvBERT model (YituTech/conv-bert-base) from this link: https://huggingface.co/YituTech/conv-bert-base

from pytorch_pretrained_bert import BertTokenizer, BertForSequenceClassification, BertAdam
tokenizer = BertTokenizer.from_pretrained("path_to_Conv-Bert_model", do_lower_case = True)
model = BertForSequenceClassification.from_pretrained("path_to_Conv-Bert_model", num_labels = 240)
model.cuda()

I want to understand: can we call any classification module from Hugging Face and pass any pre-trained model to it, like RoBERTa, ConvBERT, and so on (as in the example above)? Is it mandatory to use a ConvBERT classification pre-trained model?

LysandreJik commented 3 years ago

Hello! We have several documents that can help you get started! First of all, the quicktour, and the free course of the HF ecosystem may help you out.

pratikchhapolika commented 3 years ago

> Hello! We have several documents that can help you get started! First of all, the quicktour, and the free course of the HF ecosystem may help you out.

What about my code above? Is this the correct way of doing things?

RahulSChand commented 3 years ago

@pratikchhapolika Hi Pratik, yes, you can use most models for sequence classification. You can do the following:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("name_of_base_model")
model = AutoModelForSequenceClassification.from_pretrained("name_of_base_model")

# name_of_base_model can be bert-base-cased, albert-base-v2, roberta-large, etc.

The full list of supported models is in the documentation. You can then take the model and fine-tune it on the desired classification task (e.g. GLUE / SuperGLUE).
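Since the original question is about multi-label classification specifically, it is worth noting that `from_pretrained` also accepts a `problem_type="multi_label_classification"` argument, which makes the model train with a per-label binary loss; at inference time each label is then decided independently with a sigmoid and a threshold, rather than a softmax/argmax. A minimal sketch of that post-processing step (the `logits` tensor below is a stand-in for the `logits` attribute of the model output; the threshold of 0.5 is a common default, not a library requirement):

```python
import torch

def logits_to_labels(logits: torch.Tensor, threshold: float = 0.5):
    """Turn raw multi-label logits into per-example lists of predicted label indices."""
    # Each label gets an independent probability via sigmoid.
    probs = torch.sigmoid(logits)
    # Keep every label whose probability clears the threshold.
    return [(probs[i] > threshold).nonzero(as_tuple=True)[0].tolist()
            for i in range(probs.shape[0])]

# Example: a batch of 2 examples with 4 candidate labels.
logits = torch.tensor([[2.0, -1.0, 0.3, -3.0],
                       [-0.5, 1.5, -2.0, 0.1]])
print(logits_to_labels(logits))  # [[0, 2], [1, 3]]
```

Note that an example can receive zero labels or several, which is exactly what distinguishes the multi-label setup from plain multi-class classification.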

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.