jiesutd / NCRFpp

NCRF++, a Neural Sequence Labeling Toolkit. Easy to use for any sequence labeling task (e.g. NER, POS tagging, segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.
Apache License 2.0

Decode error with nbest=0 when not using CRF #179

Open jd-coderepos opened 3 years ago

jd-coderepos commented 3 years ago

Thank you very much for the response in https://github.com/jiesutd/NCRFpp/issues/176. I tried the suggestion of using nbest=0 with two different types of models and ran into errors with both. I would appreciate your recommendations on how to resolve them. Thank you. Please find the experiment run logs below.
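For reference, I am running decode through a config file along these lines (a sketch only; the paths and nbest value are the ones shown in the first log below):

```
### Decode configuration (decode.config) ###
status=decode
raw_dir=../drive/MyDrive/experiments/test.data
decode_dir=../drive/MyDrive/experiments/test.out
dset_dir=../drive/MyDrive/experiments/ccnn-wbilstm/ccnn-wbilstm.dset
load_model_dir=../drive/MyDrive/experiments/ccnn-wbilstm/ccnn-wbilstm.17.model
nbest=0
```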

Seed num: 42
MODEL: decode
../drive/MyDrive/experiments/test.data
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
DATA SUMMARY START:
 I/O:
     Start   Sequence   Laebling   task...
     Tag          scheme: BMES
     Split         token:  ||| 
     MAX SENTENCE LENGTH: 250
     MAX   WORD   LENGTH: -1
     Number   normalized: True
     Word  alphabet size: 25612
     Char  alphabet size: 80
     Label alphabet size: 30
     Word embedding  dir: None
     Char embedding  dir: None
     Word embedding size: 50
     Char embedding size: 30
     Norm   word     emb: False
     Norm   char     emb: False
     Train  file directory: ../drive/MyDrive/experiments/train.data
     Dev    file directory: ../drive/MyDrive/experiments/dev.data
     Test   file directory: ../drive/MyDrive/experiments/test.data
     Raw    file directory: ../drive/MyDrive/experiments/test.data
     Dset   file directory: ../drive/MyDrive/experiments/ccnn-wbilstm/ccnn-wbilstm.dset
     Model  file directory: ../drive/MyDrive/experiments/ccnn-wbilstm
     Loadmodel   directory: ../drive/MyDrive/experiments/ccnn-wbilstm/ccnn-wbilstm.17.model
     Decode file directory: ../drive/MyDrive/experiments/test.out
     Train instance number: 20200
     Dev   instance number: 1142
     Test  instance number: 9700
     Raw   instance number: 0
     FEATURE num: 0
 ++++++++++++++++++++++++++++++++++++++++
 Model Network:
     Model        use_crf: False
     Model word extractor: LSTM
     Model       use_char: True
     Model char extractor: CNN
     Model char_hidden_dim: 50
 ++++++++++++++++++++++++++++++++++++++++
 Training:
     Optimizer: SGD
     Iteration: 50
     BatchSize: 10
     Average  batch   loss: False
 ++++++++++++++++++++++++++++++++++++++++
 Hyperparameters:
     Hyper              lr: 0.015
     Hyper        lr_decay: 0.05
     Hyper         HP_clip: None
     Hyper        momentum: 0.0
     Hyper              l2: 1e-08
     Hyper      hidden_dim: 200
     Hyper         dropout: 0.5
     Hyper      lstm_layer: 2
     Hyper          bilstm: True
     Hyper             GPU: False
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: 0
Load Model from file:  ../drive/MyDrive/experiments/ccnn-wbilstm
build sequence labeling network...
use_char:  True
char feature extractor:  CNN
word feature extractor:  LSTM
use crf:  False
build word sequence feature extractor: LSTM...
build word representation...
build char sequence feature extractor: CNN ...
Decode raw data, nbest: 0 ...
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:652: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool1d(input, kernel_size, stride, padding, dilation, ceil_mode)
Right token =  15182  All token =  74727  acc =  0.20316619160410562
raw: time:8.27s, speed:1185.96st/s; acc: 0.2032, p: 0.0000, r: -1.0000, f: -1.0000
Traceback (most recent call last):
  File "main.py", line 568, in <module>
    data.write_decoded_results(decode_results, 'raw')
  File "/content/NCRFpp/utils/data.py", line 334, in write_decoded_results
    fout.write(content_list[idx][0][idy].encode('utf-8') + " " + predict_results[idx][idy] + '\n')
TypeError: can't concat str to bytes
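As far as I can tell, the error itself is a Python 3 issue: the line at utils/data.py:334 shown in the traceback concatenates a bytes object (the token after .encode('utf-8')) with plain strings, which Python 3 rejects. A minimal standalone reproduction (illustrative variable names only):

```python
# Python 3 does not allow mixing bytes and str in concatenation,
# which is exactly what the write line in write_decoded_results does.
token = "word".encode('utf-8')   # bytes, as produced by .encode('utf-8')
label = "B-ENT"                  # str, as in predict_results

try:
    line = token + " " + label + "\n"
except TypeError as e:
    print(e)  # -> can't concat str to bytes
```

The second run, with a word-only BiLSTM model (no character features and no CRF), fails at the same place: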
Seed num: 42
MODEL: decode
../drive/MyDrive/experiments/test.data
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
DATA SUMMARY START:
 I/O:
     Start   Sequence   Laebling   task...
     Tag          scheme: BMES
     Split         token:  ||| 
     MAX SENTENCE LENGTH: 250
     MAX   WORD   LENGTH: -1
     Number   normalized: True
     Word  alphabet size: 25612
     Char  alphabet size: 80
     Label alphabet size: 30
     Word embedding  dir: None
     Char embedding  dir: None
     Word embedding size: 50
     Char embedding size: 30
     Norm   word     emb: False
     Norm   char     emb: False
     Train  file directory: ../drive/MyDrive/experiments/train.data
     Dev    file directory: ../drive/MyDrive/experiments/dev.data
     Test   file directory: ../drive/MyDrive/experiments/test.data
     Raw    file directory: ../drive/MyDrive/experiments/test.data
     Dset   file directory: ../drive/MyDrive/experiments/wbilstm/wbilstm.dset
     Model  file directory: ../drive/MyDrive/experiments/wbilstm
     Loadmodel   directory: ../drive/MyDrive/experiments/wbilstm/wbilstm.7.model
     Decode file directory: ../drive/MyDrive/experiments/test.out
     Train instance number: 20200
     Dev   instance number: 1142
     Test  instance number: 9700
     Raw   instance number: 0
     FEATURE num: 0
 ++++++++++++++++++++++++++++++++++++++++
 Model Network:
     Model        use_crf: False
     Model word extractor: LSTM
     Model       use_char: False
 ++++++++++++++++++++++++++++++++++++++++
 Training:
     Optimizer: SGD
     Iteration: 50
     BatchSize: 10
     Average  batch   loss: False
 ++++++++++++++++++++++++++++++++++++++++
 Hyperparameters:
     Hyper              lr: 0.015
     Hyper        lr_decay: 0.05
     Hyper         HP_clip: None
     Hyper        momentum: 0.0
     Hyper              l2: 1e-08
     Hyper      hidden_dim: 200
     Hyper         dropout: 0.5
     Hyper      lstm_layer: 2
     Hyper          bilstm: True
     Hyper             GPU: False
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
nbest: 0
Load Model from file:  ../drive/MyDrive/experiments/wbilstm
build sequence labeling network...
use_char:  False
word feature extractor:  LSTM
use crf:  False
build word sequence feature extractor: LSTM...
build word representation...
Decode raw data, nbest: 0 ...
Right token =  14289  All token =  74727  acc =  0.19121602633586254
raw: time:11.07s, speed:885.59st/s; acc: 0.1912, p: 0.0000, r: -1.0000, f: -1.0000
Traceback (most recent call last):
  File "main.py", line 568, in <module>
    data.write_decoded_results(decode_results, 'raw')
  File "/content/NCRFpp/utils/data.py", line 334, in write_decoded_results
    fout.write(content_list[idx][0][idy].encode('utf-8') + " " + predict_results[idx][idy] + '\n')
TypeError: can't concat str to bytes
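A minimal local workaround that lets decoding finish for me is to drop the .encode('utf-8') call on that line, since the output file is written in text mode there. This is just a sketch of the change around utils/data.py:334, not necessarily the fix you would prefer upstream:

```python
# utils/data.py, write_decoded_results (around line 334)
# fout is opened in text mode, so everything written must stay str in Python 3.
# Before (fails with "can't concat str to bytes"):
#   fout.write(content_list[idx][0][idy].encode('utf-8') + " " + predict_results[idx][idy] + '\n')
# After:
fout.write(content_list[idx][0][idy] + " " + predict_results[idx][idy] + '\n')
```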