Closed wannaphong closed 4 years ago
Hi @wannaphong, thank you for your question.
I use the BMES tags (4 dims) and two special tokens (2 dims) for word segmentation. The special tokens are Start of Sentence (SOS) and End of Sentence (EOS).
The CRF layer of the BiLSTM-CRF uses a transition matrix to model dependencies between tags. For example, it learns that the M tag frequently occurs after the B tag and that the E tag frequently occurs after the M tag. To model sentence boundaries in the same way, it needs the special tokens (SOS and EOS). With them, the matrix can also represent that the B tag frequently occurs right after the SOS token and that the E tag frequently occurs right before the EOS token.
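To make the role of the two extra dimensions concrete, here is a minimal toy sketch in plain Python. It is not nagisa's code: the names `TAGS`, `build_transitions`, and `viterbi` are hypothetical, and the transition scores are hand-written hard constraints, whereas a real CRF layer learns soft scores during training. It only illustrates how a 6x6 transition matrix with SOS and EOS rows and columns shapes the decoded BMES sequence.

```python
# Assumed tag inventory: 4 BMES tags plus 2 boundary tokens = 6 dims.
TAGS = ["B", "M", "E", "S", "SOS", "EOS"]
IDX = {t: i for i, t in enumerate(TAGS)}
N_REAL = 4    # only B, M, E, S can label an actual character
NEG = -1e4    # toy score for transitions that should never occur


def build_transitions():
    """Toy scores trans[prev][next]; a trained CRF would learn these."""
    n = len(TAGS)
    trans = [[NEG] * n for _ in range(n)]
    allowed = {
        "SOS": ["B", "S"],        # a sentence starts a word
        "B": ["M", "E"],
        "M": ["M", "E"],
        "E": ["B", "S", "EOS"],   # a sentence ends on a finished word
        "S": ["B", "S", "EOS"],
    }
    for prev, nexts in allowed.items():
        for nxt in nexts:
            trans[IDX[prev]][IDX[nxt]] = 0.0
    return trans


def viterbi(emissions, trans):
    """Best BMES path; emissions holds one 4-score list per character."""
    # The first character is scored as a transition out of SOS.
    score = [trans[IDX["SOS"]][t] + emissions[0][t] for t in range(N_REAL)]
    back = []
    for emit in emissions[1:]:
        new_score, ptr = [], []
        for t in range(N_REAL):
            best = max(range(N_REAL), key=lambda p: score[p] + trans[p][t])
            new_score.append(score[best] + trans[best][t] + emit[t])
            ptr.append(best)
        score = new_score
        back.append(ptr)
    # The last character is scored as a transition into EOS.
    last = max(range(N_REAL), key=lambda t: score[t] + trans[t][IDX["EOS"]])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return [TAGS[t] for t in path]


# Per-character scores over [B, M, E, S]. Taken alone, the first character
# prefers M, which cannot start a sentence; the SOS row rules that out.
emissions = [
    [0.1, 1.0, 0.0, 0.2],
    [0.0, 0.5, 0.4, 0.0],
    [0.9, 0.0, 0.8, 0.0],
]
print(viterbi(emissions, build_transitions()))  # ['B', 'M', 'E']
```

In this example the SOS row prevents the sequence from starting with M, and the EOS column prevents it from ending on B, even though the per-character scores alone would allow both.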
I referred to the BiLSTM-CRF model architecture (https://www.aclweb.org/anthology/N16-1030/) and its original implementation (https://github.com/glample/tagger/blob/1c9618889fb89500cc5e70c45c27859b89d44449/model.py#L285).
Thank you.
Thank you. 👍
Why do you have 6-dim outputs for word segmentation? In https://github.com/taishi-i/nagisa/blob/master/nagisa/model.py#L59, encode_ws has 6-dim outputs. I understand that you are using BMES (the first 4 dims). What are the last two dims used for? Could you explain that, please? Thank you.