google-research / bert

TensorFlow code and pre-trained models for BERT
https://arxiv.org/abs/1810.04805
Apache License 2.0

How to do multi-label classification #371

Open cvenour opened 5 years ago

cvenour commented 5 years ago

How would I modify run_classifier.py so that it does multi-label classification? What I mean is that an item in the dataset can belong to more than one class, not just one. For example, a sentence can be assigned multiple labels such as 'weather' and 'safety' (i.e. the sentence is about weather and about safety).

rodgzilla commented 5 years ago

An easy way to do this would be to "plug" multiple classification heads (linear layers) on top of the core network, each performing a one-versus-rest binary classification task for its label.
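
A minimal sketch of that idea (this is not from the repo; `pooled_output`, `hidden_size` and `n_labels` are assumed to come from the surrounding run_classifier.py code, and the function name is made up):

```python
import tensorflow as tf

def multi_head_logits(pooled_output, hidden_size, n_labels):
  """One 2-way (one-vs-rest) head per label on top of BERT's pooled [CLS] output."""
  per_label_logits = []
  for i in range(n_labels):
    with tf.variable_scope("label_head_%d" % i):
      w = tf.get_variable(
          "output_weights", [2, hidden_size],
          initializer=tf.truncated_normal_initializer(stddev=0.02))
      b = tf.get_variable("output_bias", [2], initializer=tf.zeros_initializer())
      # [batch_size, 2] logits for "label i present" vs "label i absent"
      per_label_logits.append(tf.matmul(pooled_output, w, transpose_b=True) + b)
  # Final shape: [batch_size, n_labels, 2]
  return tf.stack(per_label_logits, axis=1)
```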

hsm207 commented 5 years ago

@rodgzilla Should any changes be made to the loss function too?

rodgzilla commented 5 years ago

If you have n different labels, you should see the output of the model as n binary classification outputs.

(disclaimer: I'm not used to working with TF, so this might not be the proper way to do it)

Let's say you have a batch_size of 10 and 4 different labels, and the input X has the label vector [True, False, True, False], meaning it belongs to classes 0 and 2.

If you use the methodology that I described, the output of your model will be of shape [batch_size, n_labels, 2] (in our case [10, 4, 2]) and the target is of shape [batch_size, n_labels] (in our case [10, 4]).

To use tf.nn.sparse_softmax_cross_entropy_with_logits, you should reshape your tensor to be of shape [batch_size * n_labels, 2] for the logits and [batch_size * n_labels] for the targets (respectively [40, 2] and [40] in our case).

There may be a simpler way to do this using some kind of binary cross-entropy loss, but as I've said before, I don't work with TF.
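
A rough sketch of the reshaping described above, assuming `logits` of shape [batch_size, n_labels, 2] and integer 0/1 targets of shape [batch_size, n_labels] (the function name is illustrative, not from the repo):

```python
import tensorflow as tf

def one_vs_rest_loss(logits, labels):
  """logits: [batch_size, n_labels, 2]; labels: [batch_size, n_labels] with 0/1 entries."""
  flat_logits = tf.reshape(logits, [-1, 2])   # [batch_size * n_labels, 2]
  flat_labels = tf.reshape(labels, [-1])      # [batch_size * n_labels]
  per_pair_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
      labels=flat_labels, logits=flat_logits)
  return tf.reduce_mean(per_pair_loss)
```

The "simpler" binary cross-entropy alternative mentioned above would keep a single logit per label and use `tf.nn.sigmoid_cross_entropy_with_logits` instead, which is essentially the sigmoid approach suggested in the next comment.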

yajian commented 5 years ago

How would I modify run_classifier.py so that it does multi-label classification? What I mean is that an item in the dataset can belong to more than one class, not just one. For example, a sentence can be assigned multiple labels such as 'weather' and 'safety' (i.e. the sentence is about weather and about safety).

Change the softmax layer to a sigmoid layer. You can check out https://github.com/yajian/bert.git
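
For reference, a hedged sketch of what that change amounts to inside `create_model` in run_classifier.py (not the exact code from the linked fork): the original code applies a softmax over `num_labels` mutually exclusive classes, while here `labels` is assumed to be a multi-hot tensor of shape [batch_size, num_labels].

```python
import tensorflow as tf

def multi_label_loss(logits, labels):
  """logits: [batch_size, num_labels]; labels: multi-hot tensor of the same shape."""
  # Independent per-label binary cross-entropy instead of a single softmax.
  per_label_loss = tf.nn.sigmoid_cross_entropy_with_logits(
      labels=tf.cast(labels, tf.float32), logits=logits)   # [batch, num_labels]
  per_example_loss = tf.reduce_sum(per_label_loss, axis=-1)
  loss = tf.reduce_mean(per_example_loss)
  # Each label gets its own probability; threshold (e.g. at 0.5) to predict labels.
  probabilities = tf.sigmoid(logits)
  return loss, per_example_loss, probabilities
```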

XSilverBullet commented 5 years ago

Why is the performance so low?

eval_accuracy = 0.05 eval_loss = 0.1684

YuMiaoTHU commented 5 years ago

Why is the performance so low?

eval_accuracy = 0.05 eval_loss = 0.1684

Hey! I ran into the same problem! Have you found a solution? Thanks!

NancyLele commented 4 years ago

@yajian https://github.com/yajian/bert.git

I have a question about the labels. In classes.txt, the labels are category names, and in MultiLabelTextProcessor the result of the get_labels function is a category name, not a numeric value (0 or 1). But in the training dataset, the labels are numeric.

yajian commented 4 years ago

@NancyLele The numeric labels in the dataset come from Kaggle; classes.txt just explains what the 0/1 values mean.