stefan-it / turkish-bert

Turkish BERT/DistilBERT, ELECTRA and ConvBERT models

Multiclass Classification #5

Closed. ozcangundes closed this issue 4 years ago.

ozcangundes commented 4 years ago

Hi, I could not find any argument for passing the number of classes to the AutoModel classes in order to apply sentiment analysis with BERTurk. Am I missing something? Thanks in advance.

from transformers import AutoModelForSequenceClassification, AdamW, AutoConfig, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "dbmdz/bert-base-turkish-cased")
stefan-it commented 4 years ago

Hi @ozcangundes ,

sorry for the late reply!

I think you should look at the GLUE example from the Transformers library (it also does classification):

https://github.com/huggingface/transformers/blob/master/examples/run_glue.py#L215-L222

For multiclass classification, I think you can use this part of the example:

https://github.com/huggingface/transformers/blob/master/examples/run_glue.py#L416

 all_labels = torch.tensor([f.label for f in features], dtype=torch.long)

Please let us know if that works!
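
For reference, here is a minimal sketch (not code from the repository; the feature objects and token ids are made up) of how integer labels for a multiclass task end up in a long tensor, in the spirit of the run_glue snippet above:

import torch

# Illustrative stand-in for the feature objects produced in run_glue
# (the real ones come from convert_examples_to_features).
class InputFeatures:
    def __init__(self, input_ids, attention_mask, label):
        self.input_ids = input_ids
        self.attention_mask = attention_mask
        self.label = label

# Four sentiment classes encoded as integers 0..3 (token ids are placeholders).
features = [
    InputFeatures(input_ids=[2, 1537, 3], attention_mask=[1, 1, 1], label=0),
    InputFeatures(input_ids=[2, 2402, 3], attention_mask=[1, 1, 1], label=3),
]

# For multiclass classification the labels are a plain long tensor;
# the model then applies a cross-entropy loss over num_labels logits.
all_labels = torch.tensor([f.label for f in features], dtype=torch.long)
print(all_labels)  # tensor([0, 3])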

ozcangundes commented 4 years ago

Hi @stefan-it,

Thank you for your response. I could not get it to work that way, but it worked after passing the num_labels argument to AutoConfig. Here is my solution:


config = AutoConfig.from_pretrained(
    "dbmdz/bert-base-turkish-cased", num_labels=4)
model = AutoModelForSequenceClassification.from_pretrained(
    "dbmdz/bert-base-turkish-cased", config=config)

model.cuda()
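
As a quick sanity check (just a sketch; the example sentence is made up), the classification head should now produce one logit per class:

import torch

# Hypothetical example sentence; any Turkish text would do.
inputs = tokenizer("Bu film harikaydı!", return_tensors="pt")
inputs = {k: v.cuda() for k, v in inputs.items()}

with torch.no_grad():
    logits = model(**inputs)[0]  # first output element is the logits

print(logits.shape)  # expected: torch.Size([1, 4])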
hosjiu1702 commented 2 years ago

> Hi @stefan-it,
>
> Thank you for your response. I could not get it to work that way, but it worked after passing the num_labels argument to AutoConfig. Here is my solution:
>
> config = AutoConfig.from_pretrained(
>     "dbmdz/bert-base-turkish-cased", num_labels=4)
> model = AutoModelForSequenceClassification.from_pretrained(
>     "dbmdz/bert-base-turkish-cased", config=config)
>
> model.cuda()

You do not need AutoConfig if the only configuration attribute you care about is num_labels.

All you need is to pass the num_labels keyword argument to AutoModelForSequenceClassification.from_pretrained, like this:

model = AutoModelForSequenceClassification.from_pretrained("dbmdz/bert-base-turkish-cased", num_labels=4)
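
A short check (sketch only, assuming the BERT model class, where the head is a single Linear layer named classifier) that the keyword argument really ends up in the config and sizes the head accordingly:

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "dbmdz/bert-base-turkish-cased", num_labels=4)

# from_pretrained forwards unused keyword arguments to the config,
# so the classification head is created with four outputs.
print(model.config.num_labels)        # 4
print(model.classifier.out_features)  # 4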