kmkurn / pytorch-crf

(Linear-chain) Conditional random field in PyTorch.
https://pytorch-crf.readthedocs.io
MIT License

Should I apply softmax or log_softmax to my token_scores? #108

Closed · poteminr closed this issue 1 year ago

poteminr commented 1 year ago
```python
embedded_text_input = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
embedded_text_input = self.dropout(F.leaky_relu(embedded_text_input))
token_scores = F.log_softmax(self.feedforward(embedded_text_input), dim=-1)
# or
# token_scores = self.feedforward(embedded_text_input)

loss, output_tags = self.apply_crf(token_scores, labels, attention_mask, batch_size=batch_size)
```

Should I apply softmax before passing token_scores to the CRF?

kmkurn commented 1 year ago

No, you don’t have to. The CRF is already a (very large) softmax over the possible tag sequences.

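For reference, here is a minimal sketch of passing raw (unnormalized) emission scores straight into this library's `CRF` module; the tag count, tensor shapes, and random inputs are illustrative assumptions, not from the thread:

```python
import torch
from torchcrf import CRF

num_tags = 5                                  # illustrative tag-set size
crf = CRF(num_tags, batch_first=True)

# Raw scores straight from a linear layer: no softmax/log_softmax needed.
emissions = torch.randn(2, 7, num_tags)       # (batch, seq_len, num_tags)
tags = torch.randint(num_tags, (2, 7))        # gold tag indices
mask = torch.ones(2, 7, dtype=torch.bool)     # all tokens valid in this toy batch

loss = -crf(emissions, tags, mask=mask, reduction='mean')  # negative log-likelihood
best_paths = crf.decode(emissions, mask=mask)              # Viterbi-decoded tag sequences
```

Applying `log_softmax` first only subtracts a per-position constant from the emissions, which cancels between a path's score and the partition function, so the CRF log-likelihood (and hence the loss) is unchanged either way.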

poteminr commented 1 year ago

Thank you!