NCRF++, a Neural Sequence Labeling Toolkit. Easy use to any sequence labeling tasks (e.g. NER, POS, Segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.
Thank you very much for your response in https://github.com/jiesutd/NCRFpp/issues/176. I tried the suggestion of using nbest=0 with two different types of models, and both runs fail with the same TypeError when the decoded results are written out. I would appreciate your recommendations on how to resolve this. The logs from both runs are below.
Attempt 1: ccnn-wbilstm model
Seed num: 42
MODEL: decode
../drive/MyDrive/experiments/test.data
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
DATA SUMMARY START:
I/O:
Start Sequence Laebling task...
Tag scheme: BMES
Split token: |||
MAX SENTENCE LENGTH: 250
MAX WORD LENGTH: -1
Number normalized: True
Word alphabet size: 25612
Char alphabet size: 80
Label alphabet size: 30
Word embedding dir: None
Char embedding dir: None
Word embedding size: 50
Char embedding size: 30
Norm word emb: False
Norm char emb: False
Train file directory: ../drive/MyDrive/experiments/train.data
Dev file directory: ../drive/MyDrive/experiments/dev.data
Test file directory: ../drive/MyDrive/experiments/test.data
Raw file directory: ../drive/MyDrive/experiments/test.data
Dset file directory: ../drive/MyDrive/experiments/ccnn-wbilstm/ccnn-wbilstm.dset
Model file directory: ../drive/MyDrive/experiments/ccnn-wbilstm
Loadmodel directory: ../drive/MyDrive/experiments/ccnn-wbilstm/ccnn-wbilstm.17.model
Decode file directory: ../drive/MyDrive/experiments/test.out
Train instance number: 20200
Dev instance number: 1142
Test instance number: 9700
Raw instance number: 0
FEATURE num: 0
++++++++++++++++++++++++++++++++++++++++
Model Network:
Model use_crf: False
Model word extractor: LSTM
Model use_char: True
Model char extractor: CNN
Model char_hidden_dim: 50
++++++++++++++++++++++++++++++++++++++++
Training:
Optimizer: SGD
Iteration: 50
BatchSize: 10
Average batch loss: False
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.015
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 2
Hyper bilstm: True
Hyper GPU: False
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: 0
Load Model from file: ../drive/MyDrive/experiments/ccnn-wbilstm
build sequence labeling network...
use_char: True
char feature extractor: CNN
word feature extractor: LSTM
use crf: False
build word sequence feature extractor: LSTM...
build word representation...
build char sequence feature extractor: CNN ...
Decode raw data, nbest: 0 ...
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:652: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)
return torch.max_pool1d(input, kernel_size, stride, padding, dilation, ceil_mode)
Right token = 15182 All token = 74727 acc = 0.20316619160410562
raw: time:8.27s, speed:1185.96st/s; acc: 0.2032, p: 0.0000, r: -1.0000, f: -1.0000
Traceback (most recent call last):
File "main.py", line 568, in <module>
data.write_decoded_results(decode_results, 'raw')
File "/content/NCRFpp/utils/data.py", line 334, in write_decoded_results
fout.write(content_list[idx][0][idy].encode('utf-8') + " " + predict_results[idx][idy] + '\n')
TypeError: can't concat str to bytes
Attempt 2: wbilstm model
Seed num: 42
MODEL: decode
../drive/MyDrive/experiments/test.data
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
DATA SUMMARY START:
I/O:
Start Sequence Laebling task...
Tag scheme: BMES
Split token: |||
MAX SENTENCE LENGTH: 250
MAX WORD LENGTH: -1
Number normalized: True
Word alphabet size: 25612
Char alphabet size: 80
Label alphabet size: 30
Word embedding dir: None
Char embedding dir: None
Word embedding size: 50
Char embedding size: 30
Norm word emb: False
Norm char emb: False
Train file directory: ../drive/MyDrive/experiments/train.data
Dev file directory: ../drive/MyDrive/experiments/dev.data
Test file directory: ../drive/MyDrive/experiments/test.data
Raw file directory: ../drive/MyDrive/experiments/test.data
Dset file directory: ../drive/MyDrive/experiments/wbilstm/wbilstm.dset
Model file directory: ../drive/MyDrive/experiments/wbilstm
Loadmodel directory: ../drive/MyDrive/experiments/wbilstm/wbilstm.7.model
Decode file directory: ../drive/MyDrive/experiments/test.out
Train instance number: 20200
Dev instance number: 1142
Test instance number: 9700
Raw instance number: 0
FEATURE num: 0
++++++++++++++++++++++++++++++++++++++++
Model Network:
Model use_crf: False
Model word extractor: LSTM
Model use_char: False
++++++++++++++++++++++++++++++++++++++++
Training:
Optimizer: SGD
Iteration: 50
BatchSize: 10
Average batch loss: False
++++++++++++++++++++++++++++++++++++++++
Hyperparameters:
Hyper lr: 0.015
Hyper lr_decay: 0.05
Hyper HP_clip: None
Hyper momentum: 0.0
Hyper l2: 1e-08
Hyper hidden_dim: 200
Hyper dropout: 0.5
Hyper lstm_layer: 2
Hyper bilstm: True
Hyper GPU: False
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: 0
Load Model from file: ../drive/MyDrive/experiments/wbilstm
build sequence labeling network...
use_char: False
word feature extractor: LSTM
use crf: False
build word sequence feature extractor: LSTM...
build word representation...
Decode raw data, nbest: 0 ...
Right token = 14289 All token = 74727 acc = 0.19121602633586254
raw: time:11.07s, speed:885.59st/s; acc: 0.1912, p: 0.0000, r: -1.0000, f: -1.0000
Traceback (most recent call last):
File "main.py", line 568, in <module>
data.write_decoded_results(decode_results, 'raw')
File "/content/NCRFpp/utils/data.py", line 334, in write_decoded_results
fout.write(content_list[idx][0][idy].encode('utf-8') + " " + predict_results[idx][idy] + '\n')
TypeError: can't concat str to bytes
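My reading of the traceback, in case it helps: both runs are on Python 3.7, where `.encode('utf-8')` returns `bytes`, and `bytes` cannot be concatenated with the `str` values `" "` and the predicted label. Below is a minimal sketch that reproduces the error and shows one possible workaround, which is to keep everything as `str` and drop the `.encode('utf-8')` call. The function name `write_tagged_sentences`, the toy data, and the blank line between sentences are my own assumptions for illustration; only the `content_list[idx][0][idy]` and `predict_results[idx][idy]` indexing is taken from the traceback, and I have not checked this against the rest of `utils/data.py`.

```python
import io

# Reproduce the failure from the traceback: bytes + str is not allowed in Python 3.
error_message = None
try:
    "word".encode("utf-8") + " " + "S-PER"
except TypeError as err:
    error_message = str(err)


def write_tagged_sentences(content_list, predict_results, fout):
    """Hypothetical stand-in for the failing write loop.

    Assumes fout is opened in text mode and that content_list[idx][0]
    holds the tokens of sentence idx as str, as the traceback suggests.
    """
    for idx, labels in enumerate(predict_results):
        tokens = content_list[idx][0]
        for idy, label in enumerate(labels):
            # Everything stays str, so no .encode('utf-8') is needed here.
            fout.write(tokens[idy] + " " + label + "\n")
        # Assumption: sentences are separated by a blank line.
        fout.write("\n")


# Toy data shaped like the structures indexed in the traceback.
content_list = [[["John", "runs"]]]
predict_results = [["S-PER", "O"]]

buffer = io.StringIO()
write_tagged_sentences(content_list, predict_results, buffer)
output = buffer.getvalue()
print(output)
```

If the output file has to be UTF-8 on disk, opening it with `open(path, 'w', encoding='utf-8')` should cover that without encoding each token by hand.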