Open cvenour opened 5 years ago
An easy way to do this would be to "plug" multiple classification heads (linear layers) on top of the core network, each performing a one-versus-rest binary classification task for one label.
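A minimal sketch of the "one head per label" idea, written in NumPy rather than TF so it runs on its own. The names pooled_output, head_weights, and head_biases and the hidden size of 768 are assumptions for illustration, not taken from run_classifier.py.

```python
import numpy as np

rng = np.random.default_rng(0)

# hidden_size is an assumed value; batch_size and n_labels are arbitrary.
batch_size, hidden_size, n_labels = 10, 768, 4

# Stand-in for the pooled output of the core network: [batch_size, hidden_size].
pooled_output = rng.normal(size=(batch_size, hidden_size))

# One linear head per label; each head maps hidden_size -> 2 logits (no / yes).
head_weights = rng.normal(scale=0.02, size=(n_labels, hidden_size, 2))
head_biases = np.zeros((n_labels, 2))

# Apply every head to every example: the result is [batch_size, n_labels, 2].
logits = np.einsum("bh,lhc->blc", pooled_output, head_weights) + head_biases

print(logits.shape)  # (10, 4, 2)
```

Each slice logits[:, k, :] is the two-way (absent / present) output of the head for label k.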
@rodgzilla Should any changes be made to the loss function too?
If you have n different labels, you should see the output of the model as n binary classification outputs.
(disclaimer: I'm not used to working with TF, so this might not be the proper way to do it)
Let's say you have a batch_size of 10, 4 different labels, and the input X has label [True, False, True, False], meaning it belongs to classes 0 and 2.
If you use the methodology that I described, the output of your model will be of shape [batch_size, n_labels, 2] (in our case [10, 4, 2]) and the target is of shape [batch_size, n_labels] (in our case [10, 4]).
To use tf.nn.sparse_softmax_cross_entropy_with_logits, you should reshape your tensor to be of shape [batch_size * n_labels, 2] for the logits and [batch_size * n_labels] for the targets (respectively [40, 2] and [40] in our case).
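The reshape described above can be sketched as follows. This is a NumPy stand-in, not the TF call itself: sparse_softmax_cross_entropy is a hypothetical helper that is assumed to mirror what tf.nn.sparse_softmax_cross_entropy_with_logits computes per row, and the logits are random placeholders.

```python
import numpy as np

def sparse_softmax_cross_entropy(logits, targets):
    """Per-row softmax cross-entropy for integer class targets.

    logits: float array [n_rows, n_classes]; targets: int array [n_rows].
    """
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets]

batch_size, n_labels = 10, 4
rng = np.random.default_rng(0)

# Placeholder model output of shape [batch_size, n_labels, 2].
logits = rng.normal(size=(batch_size, n_labels, 2))
# Every example gets the labels from the example above: classes 0 and 2.
targets = np.tile(np.array([1, 0, 1, 0]), (batch_size, 1))  # [10, 4]

flat_logits = logits.reshape(batch_size * n_labels, 2)  # [40, 2]
flat_targets = targets.reshape(batch_size * n_labels)   # [40]

per_item_loss = sparse_softmax_cross_entropy(flat_logits, flat_targets)  # [40]
loss = per_item_loss.mean()

print(flat_logits.shape, flat_targets.shape, per_item_loss.shape)
```

After the reshape, every (example, label) pair is treated as an independent two-class problem, which is why the loss has 40 entries before averaging.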
There may be a simpler way to do this using some kind of binary cross-entropy loss but, as I've said before, I don't work with TF.
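For reference, a binary cross-entropy variant needs only one logit per label instead of two. The sketch below is a NumPy version of the numerically stable formula max(x, 0) - x * z + log(1 + exp(-|x|)); TF provides tf.nn.sigmoid_cross_entropy_with_logits for this purpose, but the function here is a standalone reimplementation with placeholder inputs, not the TF op.

```python
import numpy as np

def sigmoid_cross_entropy_with_logits(logits, labels):
    """Element-wise binary cross-entropy computed directly from raw logits.

    Uses the stable form max(x, 0) - x * z + log(1 + exp(-|x|)).
    """
    return (np.maximum(logits, 0) - logits * labels
            + np.log1p(np.exp(-np.abs(logits))))

batch_size, n_labels = 10, 4
rng = np.random.default_rng(0)

# Placeholder model output: one logit per label, shape [batch_size, n_labels].
logits = rng.normal(size=(batch_size, n_labels))
# Multi-hot targets as floats: every example belongs to classes 0 and 2.
labels = np.tile(np.array([1.0, 0.0, 1.0, 0.0]), (batch_size, 1))

per_label_loss = sigmoid_cross_entropy_with_logits(logits, labels)  # [10, 4]
loss = per_label_loss.mean()

print(per_label_loss.shape)  # (10, 4)
```

No reshape is needed here because the logits and the targets already have the same [batch_size, n_labels] shape.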
How would I modify run_classifier.py to get it to do multi-label classification? By that I mean an item in the dataset can belong to more than one class, not just one. For example, a sentence can be assigned multiple labels such as 'weather' and 'safety' (i.e. the sentence is about both weather and safety).
Change the softmax layer to a sigmoid layer. You can check out https://github.com/yajian/bert.git
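To illustrate why a sigmoid output suits this task: each label gets its own independent probability, so several labels can pass the threshold at once. The sketch below is an assumption about how predictions could be read off such a layer, not code from the linked repository; predict_labels and the 0.5 threshold are hypothetical choices.

```python
import numpy as np

def predict_labels(logits, threshold=0.5):
    """Turn per-label logits into independent yes/no predictions."""
    probs = 1.0 / (1.0 + np.exp(-logits))  # sigmoid, applied per label
    return probs >= threshold

# Two made-up examples with 4 labels each.
logits = np.array([[2.0, -1.5, 0.3, -4.0],
                   [-0.2, 3.1, -2.2, 1.0]])

predictions = predict_labels(logits)
print(predictions)
# [[ True False  True False]
#  [False  True False  True]]
```

With a softmax over the labels the probabilities would sum to 1 across classes, so the layer could not express "both weather and safety" with high confidence at the same time.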
Why is the performance so low?
eval_accuracy = 0.05 eval_loss = 0.1684
Hey! I ran into the same problem! Have you found a solution? Thanks!
@yajian https://github.com/yajian/bert.git
I have a question about the labels. In classes.txt, the labels are category names, and get_labels in MultiLabelTextProcessor also returns category names rather than numeric values (0 or 1). But in the training dataset, the labels are numeric.
@NancyLele The numeric labels in the dataset come from Kaggle. classes.txt just shows the meaning of the 0/1 values.
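One way to picture the relationship between category names and 0/1 values is sketched below. It assumes that the order of names in classes.txt matches the order of the numeric label columns, which the thread does not confirm; the class list is hypothetical apart from 'weather' and 'safety', and to_multi_hot / from_multi_hot are illustrative helpers rather than functions from the repository.

```python
# Hypothetical class list; only 'weather' and 'safety' appear in the thread.
class_names = ["weather", "safety", "traffic", "sports"]

def to_multi_hot(labels, class_names):
    """Encode a list of category names as a 0/1 vector, one slot per class."""
    index = {name: i for i, name in enumerate(class_names)}
    vector = [0] * len(class_names)
    for label in labels:
        vector[index[label]] = 1
    return vector

def from_multi_hot(vector, class_names):
    """Decode a 0/1 vector back into the category names it marks."""
    return [name for name, flag in zip(class_names, vector) if flag == 1]

encoded = to_multi_hot(["weather", "safety"], class_names)
print(encoded)                               # [1, 1, 0, 0]
print(from_multi_hot(encoded, class_names))  # ['weather', 'safety']
```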