Closed: gyc913 closed this issue 5 years ago
They are added together in our paper. Please refer to equation 13.
I understand now, I am sorry.
Sorry, I have another question. Since the batch size is set to 1, there is no padding process, so why is the label size still bigger than the real number of labels?
(Pdb) self.label_alphabet.size()
6
(Pdb) self.label_alphabet.instances
['B-NAME', 'E-NAME', 'O', 'M-NAME', 'S-NAME']
which is actually 5.
Besides, in class BiLSTM_CRF, why is the label size set as follows? data.label_alphabet_size += 2
Thank you.
The extra label-alphabet entry is unused. I use the same Alphabet class to represent words/characters and labels, and for words/characters we sometimes need to set an 'unknown' token, so the label alphabet follows the same format. The 'unknown' label does not affect the results.
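To make the off-by-one concrete, here is a minimal sketch of a shared alphabet class that always reserves one slot for the unknown token, so `size()` reports one more than the number of real labels. The class name and methods mirror the thread's `label_alphabet.size()` / `.instances` calls but are an assumed reconstruction, not the repository's actual implementation.

```python
# Hypothetical sketch of a shared word/label alphabet that reserves
# an extra slot for the 'unknown' token, so size() == len(instances) + 1.
class Alphabet:
    def __init__(self):
        self.instances = []   # the real items that were added
        self.index = {}       # item -> integer id

    def add(self, item):
        if item not in self.index:
            self.index[item] = len(self.instances)
            self.instances.append(item)

    def size(self):
        # +1 reserves the 'unknown' slot shared with the word alphabet
        return len(self.instances) + 1

alpha = Alphabet()
for lab in ['B-NAME', 'E-NAME', 'O', 'M-NAME', 'S-NAME']:
    alpha.add(lab)

print(alpha.size())          # 6, even though only 5 real labels exist
print(len(alpha.instances))  # 5
```

This reproduces exactly what the debugger showed: `size()` returns 6 while `instances` holds 5 labels.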
For the CRF, we need to add two extra labels, START and END. If you understand the CRF structure, you will know why the START and END tokens are needed during inference.
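The `data.label_alphabet_size += 2` line follows from this: the CRF transition matrix must also score transitions out of a virtual START state and into a virtual END state. A minimal sketch of the resulting sizes (the variable names here are illustrative, not the repository's):

```python
# Hypothetical sketch: why the CRF needs label_alphabet_size += 2.
labels = ['B-NAME', 'E-NAME', 'O', 'M-NAME', 'S-NAME', '<UNK>']  # alphabet size 6

# Two virtual states appended after the real labels.
START = len(labels)        # index 6
END = len(labels) + 1      # index 7
n_states = len(labels) + 2  # mirrors data.label_alphabet_size += 2

# transitions[i][j] = score of moving from state i to state j.
# Row START scores how sequences begin; column END scores how they end.
transitions = [[0.0] * n_states for _ in range(n_states)]

print(n_states)  # 8
```

During Viterbi decoding, every path is forced to start from the START row and terminate in the END column, which is why the two extra states must exist in the transition matrix even though they never appear in the data.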
Sorry, I am new to PyTorch, but in the class WordLSTMCell I found:
f, i, g = torch.split(wh_b + wi, split_size=self.hidden_size, dim=1)
In the formula of your paper, wh_b and wi are not added together, so did I misunderstand your code?
```python
def forward(self, input, hx):
    """
    Args:
        input: A (batch, input_size) tensor containing input features.
        hx: A tuple (h_0, c_0), which contains the initial hidden and
            cell state, where the size of both states is (batch, hidden_size).

    Returns:
        h_1, c_1: Tensors containing the next hidden and cell state.
    """
```
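For what it's worth, the `wh_b + wi` addition and the paper's per-gate formulas are algebraically the same thing: the code stacks the weight matrices of all gates row-wise, computes one wide pre-activation from the hidden state (`wh_b`, which already includes the bias) and one from the input (`wi`), adds them, and then splits the sum into per-gate chunks. A plain-Python sketch with integer toy values (all numbers here are made up for illustration):

```python
# Hypothetical sketch: one stacked projection + split is equivalent to
# computing W_f x + U_f h + b_f, W_i x + U_i h + b_i, ... separately.
hidden_size = 2

# Pre-activations with the three gates stacked as [f | i | g]:
wh_b = [1, 2, 3, 4, 5, 6]        # plays the role of U h + b
wi = [10, 10, 10, 10, 10, 10]    # plays the role of W x

# The addition in the code: (U h + b) + (W x), still stacked per gate.
summed = [a + b for a, b in zip(wh_b, wi)]

# torch.split(..., split_size=hidden_size, dim=1) done by hand:
f = summed[0 * hidden_size:1 * hidden_size]
i = summed[1 * hidden_size:2 * hidden_size]
g = summed[2 * hidden_size:3 * hidden_size]

print(f, i, g)  # [11, 12] [13, 14] [15, 16]
```

So each gate still receives exactly `W x + U h + b` as in the paper; the single matrix multiply plus split is just a common efficiency trick, not a different model.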