ThilinaRajapakse / pytorch-transformers-classification

Based on the Pytorch-Transformers library by HuggingFace. To be used as a starting point for employing Transformer models in text classification tasks. Contains code to easily train BERT, XLNet, RoBERTa, and XLM models for text classification.
Apache License 2.0

Not an issue - Adding metadata #30

Open pythonometrist opened 5 years ago

pythonometrist commented 5 years ago

Hello again - I was wondering if you might have pointers on how to incorporate metadata with the text. I think I am good with adding a custom layer on top of BERT. I think I need to figure out how to generate each example so that a part of it goes to BERT and the rest goes to the other layers on top of BERT. Any ideas? Thanks as always.

pythonometrist commented 5 years ago

Basically, I could, for example, add a rating (as a continuous measure) as metadata. Then batch indices 0-3 refer to the usual BERT inputs, and the next one could be the rating. The problem is, in the forward method, how do I split the input into the first four elements for BERT and the next one for the additional regression layer?
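To make it concrete, here is roughly the batch layout I have in mind (random tensors just for illustration; putting the rating at index 4 is my own assumption):

import torch
from torch.utils.data import DataLoader, TensorDataset

# Indices 0-3 mirror the usual BERT batch (input ids, mask, segment ids, labels);
# index 4 is the continuous rating I want to send to the extra layer.
all_input_ids = torch.randint(0, 30522, (8, 128))
all_input_mask = torch.ones(8, 128, dtype=torch.long)
all_segment_ids = torch.zeros(8, 128, dtype=torch.long)
all_label_ids = torch.randint(0, 2, (8,))
all_ratings = torch.rand(8, 1)  # one continuous metadata value per example

dataset = TensorDataset(all_input_ids, all_input_mask, all_segment_ids, all_label_ids, all_ratings)

for batch in DataLoader(dataset, batch_size=4):
    # first four elements go to BERT, the last one to the additional layer
    input_ids, input_mask, segment_ids, label_ids, ratings = batch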

ThilinaRajapakse commented 5 years ago

How about the approach here (ExtraBertMultiClassifier)?

# Note: this snippet uses the older pytorch-pretrained-bert API
# (BertModel returns (encoded_layers, pooled_output) and takes the
# output_all_encoded_layers kwarg), not pytorch-transformers.
import torch
import torch.nn as nn
from pytorch_pretrained_bert import BertModel


class ExtraBertMultiClassifier(nn.Module):
    """BERT classifier that concatenates extra (metadata) features with BERT's pooled output."""

    def __init__(self, bert_model_path, labels_count, hidden_dim=768, mlp_dim=100, extras_dim=6, dropout=0.1):
        super().__init__()

        self.config = {
            'bert_model_path': bert_model_path,
            'labels_count': labels_count,
            'hidden_dim': hidden_dim,
            'mlp_dim': mlp_dim,
            'extras_dim': extras_dim,
            'dropout': dropout,
        }

        self.bert = BertModel.from_pretrained(bert_model_path)
        self.dropout = nn.Dropout(dropout)
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim + extras_dim, mlp_dim),
            nn.ReLU(),
            nn.Linear(mlp_dim, mlp_dim),
            # nn.ReLU(),
            # nn.Linear(mlp_dim, mlp_dim),
            nn.ReLU(),            
            nn.Linear(mlp_dim, labels_count)
        )
        # self.sigmoid = nn.Sigmoid()
        self.softmax = nn.Softmax(dim=1)  # explicit dim; nn.Softmax() without dim is deprecated

    def forward(self, tokens, masks, extras):
        # pooled_output is BERT's [CLS] representation for each sequence
        _, pooled_output = self.bert(tokens, attention_mask=masks, output_all_encoded_layers=False)
        dropout_output = self.dropout(pooled_output)

        # concatenate the extra (metadata) features onto the BERT representation
        concat_output = torch.cat((dropout_output, extras), dim=1)
        mlp_output = self.mlp(concat_output)
        # proba = self.sigmoid(mlp_output)
        proba = self.softmax(mlp_output)

        return proba

It's from this paper.
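Roughly, the usage would look something like this (the tensors are made up just to show the shapes, and it assumes the older pytorch-pretrained-bert BertModel as in the snippet above):

import torch

# 4 examples, 128 tokens, 6 extra metadata features (matching extras_dim=6)
model = ExtraBertMultiClassifier('bert-base-uncased', labels_count=2)

tokens = torch.randint(0, 30522, (4, 128))    # token ids from the BERT tokenizer
masks = torch.ones(4, 128, dtype=torch.long)  # attention masks
extras = torch.rand(4, 6)                     # metadata features, e.g. ratings

proba = model(tokens, masks, extras)          # shape: (4, labels_count)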

pythonometrist commented 5 years ago

Thanks! Let me check it out.