It's me again. I opened a new issue because I tried adding a BiLSTM layer to see whether it would improve performance.
Actually, the improvement was not impressive: about 2% between the two models.
I wanted to share with you how I added the BiLSTM layer and check whether I did it correctly, since I am a newbie in deep learning and I am trying to build my first model for a study project.
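# Assumed imports (not shown in my snippet): Hugging Face transformers and pytorch-crf.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchcrf import CRF
from transformers import BertModel, BertPreTrainedModel

log_soft = F.log_softmax  # assumed definition; log_soft is not defined in the snippet


class Bert_Bilstm_CRF(BertPreTrainedModel):
    def __init__(self, config):
        super(Bert_Bilstm_CRF, self).__init__(config)
        self.num_labels = config.num_labels
        self.bert = BertModel(config)
        # 2 directions x (768 // 2) hidden units = 768 output features,
        # which matches the classifier input size (config.hidden_size for BERT-base).
        self.rnn = nn.LSTM(bidirectional=True, num_layers=2, input_size=768, hidden_size=768 // 2, batch_first=True)
        self.dropout = nn.Dropout(config.hidden_dropout_prob)
        self.classifier = nn.Linear(config.hidden_size, self.num_labels)
        self.crf = CRF(self.num_labels, batch_first=True)

    def forward(self, input_ids, attn_masks, labels=None):  # don't confuse this with _forward_alg above.
        outputs = self.bert(input_ids, attn_masks)
        sequence_output = outputs[0]  # token-level hidden states: (batch, seq_len, hidden)
        enc, _ = self.rnn(sequence_output)  # BiLSTM over the BERT token representations
        sequence_output = self.dropout(enc)
        emission = self.classifier(sequence_output)  # per-token label scores for the CRF
        attn_masks = attn_masks.type(torch.uint8)
        if labels is not None:
            # The CRF returns a log-likelihood, so negate it to get a loss.
            loss = -self.crf(log_soft(emission, 2), labels, mask=attn_masks, reduction='mean')
            return loss
        prediction = self.crf.decode(emission, mask=attn_masks)
        return prediction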
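
One detail in `forward` that may look like a mismatch: the loss is computed on `log_soft(emission, 2)` while `decode` gets the raw `emission`. If `log_soft` is a log-softmax over the label dimension (an assumption, since it is not defined above), it only subtracts one constant per token from all of that token's label scores, so every tag path's score shifts by the same amount. The best path and the CRF log-likelihood should then be unchanged. Below is a minimal pure-Python sketch of that argument. It is not how pytorch-crf is implemented: it brute-forces all paths instead of running Viterbi, ignores start/end transitions, and the helper names and toy numbers are made up for illustration.

```python
import math
from itertools import product


def log_softmax(row):
    """Subtract the log-sum-exp of one token's label scores from each score."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def path_score(emissions, transitions, path):
    """Score of one tag path: emission scores plus tag-to-tag transition scores."""
    score = emissions[0][path[0]]
    for t in range(1, len(path)):
        score += transitions[path[t - 1]][path[t]] + emissions[t][path[t]]
    return score


def all_paths(emissions):
    """Every possible tag sequence (only feasible for a toy example)."""
    return product(range(len(emissions[0])), repeat=len(emissions))


def best_path(emissions, transitions):
    """Brute-force stand-in for Viterbi decoding."""
    return max(all_paths(emissions), key=lambda p: path_score(emissions, transitions, p))


def log_likelihood(emissions, transitions, gold):
    """Gold path score minus the log-partition over all paths."""
    scores = [path_score(emissions, transitions, p) for p in all_paths(emissions)]
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return path_score(emissions, transitions, gold) - log_z


# Toy values: 3 tokens, 3 labels (hypothetical numbers).
emissions = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3], [-0.7, 0.2, 2.2]]
transitions = [[0.5, -0.3, 0.0], [0.2, 0.1, -1.0], [-0.4, 0.6, 0.4]]
gold = (0, 1, 2)

normalized = [log_softmax(row) for row in emissions]

raw_best = best_path(emissions, transitions)
norm_best = best_path(normalized, transitions)
raw_ll = log_likelihood(emissions, transitions, gold)
norm_ll = log_likelihood(normalized, transitions, gold)

print(raw_best == norm_best)            # same decoded path either way
print(abs(raw_ll - norm_ll) < 1e-9)     # same log-likelihood up to rounding
```

If this reasoning carries over to the real model, the `log_soft` call is probably redundant rather than harmful, and passing `emission` directly in both branches would keep training and decoding visibly consistent.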