jiesutd / NCRFpp

NCRF++, a Neural Sequence Labeling Toolkit. Easy to use for any sequence labeling task (e.g. NER, POS tagging, segmentation). It includes character LSTM/CNN, word LSTM/CNN, and softmax/CRF components.
Apache License 2.0

f1 score is -1, pred_num = 0 #60

Closed precision2intelligence closed 6 years ago

precision2intelligence commented 6 years ago

This is the same issue as #22. We used our own dataset to train the NER model. The tag scheme is BIOES (the only difference is that we use "M-" instead of "I-"). The same data have been tested with your Lattice LSTM model, where they produce accurate p/r/F1 values. So I am confused: why is our F1 score -1 and pred_num = 0 with this model?
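
For reference, the -1 comes from the entity-level scorer: when the model predicts no entity spans at all (pred_num = 0), precision is undefined and is printed as -1 rather than causing a division by zero, and F1 is likewise -1 whenever right_num = 0. A minimal sketch of that logic (a hypothetical helper mirroring the gold_num/pred_num/right_num lines in the logs below, not the toolkit's exact code):

# f1_sketch.py -- why p and f show up as -1 when no entities are predicted
def f_measure(gold_num, pred_num, right_num):
    # No predicted spans: precision is undefined, reported as -1.
    precision = right_num / pred_num if pred_num else -1
    # No gold spans would make recall undefined too; here gold_num > 0.
    recall = right_num / gold_num if gold_num else -1
    # No correct spans: the harmonic mean is undefined, reported as -1.
    if right_num == 0:
        return precision, recall, -1
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f

print(f_measure(gold_num=424, pred_num=0, right_num=0))  # (-1, 0.0, -1)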

jiesutd commented 6 years ago

Please provide me with your log and sample data. Thanks.

precision2intelligence commented 6 years ago

Please find them in your e-mail! Thank you!

jiesutd commented 6 years ago

I checked your log file. There are three main problems:

  1. Based on your sample data, you are using the Chinese character-based model. In this case, the input word embeddings should actually be character embeddings, and you can set use_char=False because your basic unit is already the character (a config sketch follows this list).

  2. Your input word embeddings (which should be your character embeddings) have a very high OOV rate (>99%), which will heavily hurt your system performance.

  3. Your dev/test datasets are too small (100~200 sentences). With the previous incorrect settings, it is not surprising that you got F1 = -1.
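
A minimal sketch of the settings this implies for a character-based Chinese run, assuming the demo-style NCRF++ config keys (the path is a placeholder, not from this issue):

# tokens are already characters, so the "word" slot holds character embeddings
word_emb_dir=/path/to/char_embeddings.txt
use_char=False
use_crf=True
word_seq_feature=LSTM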

precision2intelligence commented 6 years ago

The word embeddings were trained after segmenting the corpus into words, such as "长江大桥"; the character embeddings treat it as "长", "江", "大", "桥". So the word embeddings are not needed in this case?

precision2intelligence commented 6 years ago

I have changed the word embeddings as you said, but the code can only find English characters. I printed the embedding entries that match the corpus and found that only English characters are matched, even though both the dataset and the embedding file are in Chinese. This leads to the large OOV rate. Is this tool only suitable for English corpora? Here is the log file. The embedding file is the same one we used in your Lattice model, where it works well.

Seed num: 42
MODEL: train
Load pretrained word embedding, norm: False, dir: /home//newsingle50qing.txt
k l n o p r s t u v w a b c y d z e f g q h ~ i ? j
Embedding:
     pretrain word:1282, prefect match:26, case_match:0, oov:1950, oov%:0.986342943854
Training model...
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
DATA SUMMARY START:
 I/O:
     Tag          scheme: BMES
     MAX SENTENCE LENGTH: -1
     MAX   WORD   LENGTH: -1
     Number   normalized: True
     Word  alphabet size: 1977
     Char  alphabet size: 100
     Label alphabet size: 27
     Word embedding  dir: /home//newsingle50qing.txt
     Char embedding  dir: None
     Word embedding size: 50
     Char embedding size: 30
     Norm   word     emb: False
     Norm   char     emb: False
     Train  file directory: /home//train.txt
     Dev    file directory: /home//dev.txt
     Test   file directory: /home//test.txt
     Raw    file directory: None
     Dset   file directory: /home//model/
     Model  file directory: /home/*/model/
     Loadmodel   directory: None
     Decode file directory: None
     Train instance number: 20519
     Dev   instance number: 2125
     Test  instance number: 2320
     Raw   instance number: 0
     FEATURE num: 0
 ++++++++++++++++++++++++++++++++++++++++
 Model Network:
     Model        use_crf: True
     Model word extractor: LSTM
     Model       use_char: False
 ++++++++++++++++++++++++++++++++++++++++
 Training:
     Optimizer: SGD
     Iteration: 50
     BatchSize: 16
     Average  batch   loss: False
 ++++++++++++++++++++++++++++++++++++++++
 Hyperparameters:
     Hyper              lr: 0.015
     Hyper        lr_decay: 0.05
     Hyper         HP_clip: None
     Hyper        momentum: 0.0
     Hyper              l2: 1e-08
     Hyper      hidden_dim: 200
     Hyper         dropout: 0.5
     Hyper      lstm_layer: 1
     Hyper          bilstm: True
     Hyper             GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
build network...
use_char:  False
word feature extractor:  LSTM
use crf:  True
build word sequence feature extractor: LSTM...
build word representation...
build CRF...
Epoch: 0/50
Learning rate is setted as: 0.015
     Instance: 2000; Time: 36.69s; loss: 28537.5034; acc: 50014.0/51917.0=0.9633
     Instance: 4000; Time: 38.72s; loss: 8965.4812; acc: 102685.0/105929.0=0.9694

jiesutd commented 6 years ago

It can be used for Chinese. My Lattice LSTM has several embeddings; which one do you use? If you use the character embeddings, then I guess there may be a character-encoding mismatch in your train/dev/test data.
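
One quick way to test that guess is to read both the training data and the embedding file explicitly as UTF-8 and compare their vocabularies; if almost nothing overlaps, the files are in different encodings (e.g. GBK vs. UTF-8) or keyed on different units. A minimal sketch with placeholder file names:

# oov_check.py -- rough encoding/vocabulary mismatch check (paths are placeholders)
def first_column(path, encoding="utf-8"):
    tokens = set()
    with open(path, encoding=encoding) as f:
        for line in f:
            parts = line.strip().split()
            if parts:
                tokens.add(parts[0])
    return tokens

corpus = first_column("train.txt")      # characters from the CoNLL-style corpus
emb = first_column("char_emb.txt")      # keys of the pretrained embedding file
matched = corpus & emb
print("corpus tokens:", len(corpus), "matched:", len(matched))
print("OOV rate: %.4f" % (1.0 - len(matched) / len(corpus)))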

precision2intelligence commented 6 years ago

I have followed your comments, but the F1 score is still -1. Please help us find the problem. The log file is as follows.

Seed num: 42
MODEL: train
Load pretrained word embedding, norm: False, dir: /home/*/newsingle100qing.txt
Embedding:
     pretrain word:1282, prefect match:1282, case_match:0, oov:694, oov%:0.351036924633
Training model...
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
DATA SUMMARY START:
 I/O:
     Tag          scheme: BMES
     MAX SENTENCE LENGTH: 250
     MAX   WORD   LENGTH: -1
     Number   normalized: True
     Word  alphabet size: 1977
     Char  alphabet size: 100
     Label alphabet size: 27
     Word embedding  dir: /home/*/newsingle100qing.txt
     Char embedding  dir: None
     Word embedding size: 100
     Char embedding size: 30
     Norm   word     emb: False
     Norm   char     emb: False
     Train  file directory: /home/*/train.txt
     Dev    file directory: /home/*/dev.txt
     Test   file directory: /home/*/test.txt
     Raw    file directory: None
     Dset   file directory: /home/*/model/
     Model  file directory: /home/*/model
     Loadmodel   directory: None
     Decode file directory: None
     Train instance number: 20390
     Dev   instance number: 2121
     Test  instance number: 2310
     Raw   instance number: 0
     FEATURE num: 0
 ++++++++++++++++++++++++++++++++++++++++
 Model Network:
     Model        use_crf: True
     Model word extractor: LSTM
     Model       use_char: False
 ++++++++++++++++++++++++++++++++++++++++
 Training:
     Optimizer: Adam
     Iteration: 50
     BatchSize: 16
     Average  batch   loss: False
 ++++++++++++++++++++++++++++++++++++++++
 Hyperparameters:
     Hyper              lr: 0.015
     Hyper        lr_decay: 0.05
     Hyper         HP_clip: None
     Hyper        momentum: 0.0
     Hyper              l2: 1e-08
     Hyper      hidden_dim: 200
     Hyper         dropout: 0.5
     Hyper      lstm_layer: 1
     Hyper          bilstm: True
     Hyper             GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
build network...
use_char:  False
word feature extractor:  LSTM
use crf:  True
build word sequence feature extractor: LSTM...
build word representation...
build CRF...
Epoch: 0/50
     Instance: 2000; Time: 140.87s; loss: 11691.8186; acc: 45882.0/47404.0=0.9679
     Instance: 4000; Time: 140.68s; loss: 7558.7781; acc: 88926.0/91768.0=0.9690
     Instance: 6000; Time: 128.61s; loss: 5567.9442; acc: 135566.0/139719.0=0.9703
     Instance: 8000; Time: 125.62s; loss: 4629.0814; acc: 181789.0/187174.0=0.9712
     Instance: 10000; Time: 120.99s; loss: 4022.0119; acc: 230511.0/237038.0=0.9725
     Instance: 12000; Time: 153.70s; loss: 4401.6414; acc: 278995.0/286922.0=0.9724
     Instance: 14000; Time: 133.53s; loss: 4090.0612; acc: 323919.0/333180.0=0.9722
     Instance: 16000; Time: 152.85s; loss: 4675.8454; acc: 372036.0/382784.0=0.9719
     Instance: 18000; Time: 123.16s; loss: 4129.2788; acc: 418833.0/430928.0=0.9719
     Instance: 20000; Time: 121.94s; loss: 3834.2694; acc: 466650.0/480024.0=0.9721
     Instance: 20390; Time: 30.99s; loss: 867.0348; acc: 476172.0/489836.0=0.9721
Epoch: 0 training finished. Time: 1372.94s, speed: 14.85st/s,  total loss: 55467.7651672
totalloss: 55467.7651672
gold_num =  424  pred_num =  0  right_num =  0
Dev: time: 35.98s, speed: 59.17st/s; acc: 0.9729, p: -1.0000, r: 0.0000, f: -1.0000
Exceed previous best f score: -10
Save current best model in file: /home/yangqiuxia/WCNNNCRFpp/model.0.model
gold_num =  472  pred_num =  0  right_num =  0
Test: time: 43.23s, speed: 53.74st/s; acc: 0.9767, p: -1.0000, r: 0.0000, f: -1.0000

jiesutd commented 6 years ago

You can see that the token accuracy has improved substantially, from ~80% to 97%. Your result is only from the first iteration; running more iterations should give a better result.

jiesutd commented 6 years ago

It is also strange that your dev/test datasets have more than 2000 sentences each but contain only ~400 entities. That means entities are very rare, which makes them difficult for the system to identify.
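
A quick way to verify the entity density of a CoNLL-style file is to count segment-initial tags (B-/S- in BMES, B- in BIO). A minimal sketch with a placeholder path:

# entity_density.py -- count entities and tokens in a column-format file
from collections import Counter

tags = Counter()
with open("dev.txt", encoding="utf-8") as f:      # placeholder path
    for line in f:
        parts = line.strip().split()
        if parts:
            tags[parts[-1]] += 1                  # label is the last column

entities = sum(n for t, n in tags.items() if t.startswith(("B-", "S-")))
tokens = sum(tags.values())
print("entities:", entities, "tokens:", tokens,
      "entity/token ratio: %.4f" % (entities / tokens))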

precision2intelligence commented 6 years ago

We waited for the result after enough epochs and the phenomenon still exists. We used the same dataset with your Lattice model and it worked well; the dataset also works well on other DNN models. So I have no idea what is going on here. Is there any clue?

jiesutd commented 6 years ago

Then it may be a label-imbalance problem. Since your dataset has very few entities, the model tends to ignore the rare labels. You can find some techniques for handling unbalanced data, but they are not perfect; it is a typical problem in real applications.
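
One common, admittedly imperfect, workaround along these lines is to oversample training sentences that contain at least one entity so the rare labels are seen more often. A minimal sketch, assuming the sentences have already been read into (tokens, tags) pairs (this is not part of NCRF++ itself):

# oversample.py -- duplicate entity-bearing sentences to soften label imbalance
def oversample(sentences, factor=3):
    """sentences: list of (tokens, tags) pairs; duplicates those with any non-O tag."""
    out = []
    for tokens, tags in sentences:
        out.append((tokens, tags))
        if any(tag != "O" for tag in tags):
            out.extend([(tokens, tags)] * (factor - 1))
    return out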

precision2intelligence commented 6 years ago

It may not be a label-imbalance problem. We have switched to another dataset, but the phenomenon still exists. Can you help us? Here is the log file:

Seed num: 42
MODEL: train
Load pretrained word embedding, norm: False, dir: /home//.model.txt
Embedding:
     pretrain word:331830, prefect match:3730, case_match:0, oov:673, oov%:0.152815622162
Training model...
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
DATA SUMMARY START:
 I/O:
     Tag          scheme: BIO
     MAX SENTENCE LENGTH: 250
     MAX   WORD   LENGTH: -1
     Number   normalized: True
     Word  alphabet size: 4404
     Char  alphabet size: 105
     Label alphabet size: 8
     Word embedding  dir: /home//model.txt
     Char embedding  dir: None
     Word embedding size: 100
     Char embedding size: 30
     Norm   word     emb: False
     Norm   char     emb: False
     Train  file directory: /home//my.train
     Dev    file directory: /home//my.dev
     Test   file directory: /home//my.test
     Raw    file directory: None
     Dset   file directory: /home//model/
     Model  file directory: /home/*/model
     Loadmodel   directory: None
     Decode file directory: None
     Train instance number: 16628
     Dev   instance number: 4208
     Test  instance number: 4628
     Raw   instance number: 0
     FEATURE num: 0
 ++++++++++++++++++++++++++++++++++++++++
 Model Network:
     Model        use_crf: True
     Model word extractor: LSTM
     Model       use_char: False
 ++++++++++++++++++++++++++++++++++++++++
 Training:
     Optimizer: SGD
     Iteration: 50
     BatchSize: 16
     Average  batch   loss: False
 ++++++++++++++++++++++++++++++++++++++++
 Hyperparameters:
     Hyper              lr: 0.015
     Hyper        lr_decay: 0.05
     Hyper         HP_clip: None
     Hyper        momentum: 0.0
     Hyper              l2: 1e-08
     Hyper      hidden_dim: 200
     Hyper         dropout: 0.5
     Hyper      lstm_layer: 1
     Hyper          bilstm: True
     Hyper             GPU: True
DATA SUMMARY END.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
build network...
use_char:  False
word feature extractor:  LSTM
use crf:  True
build word sequence feature extractor: LSTM...
build word representation...
build CRF...
Epoch: 0/50
Learning rate is setted as: 0.015
     Instance: 2000; Time: 62.56s; loss: 1508746.1278; acc: 73698.0/93881.0=0.7850
     Instance: 4000; Time: 55.43s; loss: 1649292.3662; acc: 148086.0/187737.0=0.7888
     Instance: 6000; Time: 46.77s; loss: 1349588.6406; acc: 221823.0/280404.0=0.7911
     Instance: 8000; Time: 50.71s; loss: 1488015.7256; acc: 294780.0/373006.0=0.7903
     Instance: 10000; Time: 51.54s; loss: 1311897.2812; acc: 369806.0/467554.0=0.7909
     Instance: 12000; Time: 46.40s; loss: 1151453.0967; acc: 443310.0/559870.0=0.7918
     Instance: 14000; Time: 44.21s; loss: 1001844.0859; acc: 518360.0/653931.0=0.7927
     Instance: 16000; Time: 46.27s; loss: 1094421.0596; acc: 589905.0/744602.0=0.7922
     Instance: 16628; Time: 13.23s; loss: 282753.6875; acc: 612711.0/773637.0=0.7920
Epoch: 0 training finished. Time: 417.13s, speed: 39.86st/s, total loss: 10838012.0712
totalloss: 10838012.0712
gold_num = 6540 pred_num = 0 right_num = 0
Dev: time: 34.64s, speed: 123.72st/s; acc: 0.8918, p: -1.0000, r: 0.0000, f: -1.0000
Exceed previous best f score: -10
Save current best model in file: /home/yangqiuxia/WCNNNCRFpp/model.0.model
gold_num = 7455 pred_num = 4 right_num = 0
Test: time: 35.72s, speed: 130.93st/s; acc: 0.8866, p: 0.0000, r: 0.0000, f: -1.0000
Epoch: 1/50
Learning rate is setted as: 0.0142857142857
     Instance: 2000; Time: 49.18s; loss: 1157811.4463; acc: 74168.0/94638.0=0.7837
     Instance: 4000; Time: 43.51s; loss: 797654.6699; acc: 147283.0/185909.0=0.7922
     Instance: 6000; Time: 44.03s; loss: 950858.7529; acc: 221901.0/280555.0=0.7909
     Instance: 8000; Time: 42.80s; loss: 985196.9434; acc: 294712.0/373338.0=0.7894
     Instance: 10000; Time: 41.63s; loss: 820030.0908; acc: 368879.0/466311.0=0.7911
     Instance: 12000; Time: 42.12s; loss: 907943.2744; acc: 442761.0/558860.0=0.7923
     Instance: 14000; Time: 45.02s; loss: 866118.1777; acc: 516136.0/651572.0=0.7921
     Instance: 16000; Time: 39.56s; loss: 874607.4980; acc: 590281.0/744204.0=0.7932
     Instance: 16628; Time: 13.17s; loss: 321950.6758; acc: 613311.0/773637.0=0.7928
Epoch: 1 training finished. Time: 361.01s, speed: 46.06st/s, total loss: 7682171.5293
totalloss: 7682171.5293
gold_num = 6540 pred_num = 7 right_num = 0
Dev: time: 32.92s, speed: 130.17st/s; acc: 0.8918, p: 0.0000, r: 0.0000, f: -1.0000
gold_num = 7455 pred_num = 12 right_num = 0
Test: time: 38.22s, speed: 123.80st/s; acc: 0.8865, p: 0.0000, r: 0.0000, f: -1.0000
Epoch: 2/50
Learning rate is setted as: 0.0136363636364
     Instance: 2000; Time: 62.19s; loss: 802904.1162; acc: 75197.0/94685.0=0.7942
     Instance: 4000; Time: 61.69s; loss: 797079.1963; acc: 150347.0/188708.0=0.7967
     Instance: 6000; Time: 44.54s; loss: 780471.0654; acc: 224016.0/281337.0=0.7963
     Instance: 8000; Time: 48.63s; loss: 797713.7139; acc: 296610.0/374062.0=0.7929
     Instance: 10000; Time: 57.54s; loss: 780577.4780; acc: 369560.0/466573.0=0.7921
     Instance: 12000; Time: 54.08s; loss: 855878.4482; acc: 443217.0/560055.0=0.7914
     Instance: 14000; Time: 45.68s; loss: 593563.6289; acc: 517002.0/651865.0=0.7931
     Instance: 16000; Time: 41.91s; loss: 782002.7080; acc: 590147.0/744541.0=0.7926
     Instance: 16628; Time: 13.48s; loss: 205705.3845; acc: 613215.0/773637.0=0.7926
Epoch: 2 training finished. Time: 429.74s, speed: 38.69st/s, total loss: 6395895.7395
totalloss: 6395895.7395
gold_num = 6540 pred_num = 17 right_num = 0
Dev: time: 43.97s, speed: 96.40st/s; acc: 0.8911, p: 0.0000, r: 0.0000, f: -1.0000
gold_num = 7455 pred_num = 25 right_num = 0
Test: time: 48.79s, speed: 96.24st/s; acc: 0.8849, p: 0.0000, r: 0.0000, f: -1.0000
Epoch: 3/50
Learning rate is setted as: 0.0130434782609
     Instance: 2000; Time: 59.16s; loss: 651076.3457; acc: 73847.0/93889.0=0.7865
     Instance: 4000; Time: 59.52s; loss: 645221.7754; acc: 146804.0/185729.0=0.7904
     Instance: 6000; Time: 54.23s; loss: 580582.8809; acc: 222166.0/280522.0=0.7920
     Instance: 8000; Time: 58.80s; loss: 550626.7510; acc: 294617.0/371781.0=0.7924
     Instance: 10000; Time: 48.71s; loss: 534637.1660; acc: 367575.0/462536.0=0.7947
     Instance: 12000; Time: 53.63s; loss: 589357.3496; acc: 441536.0/555878.0=0.7943
     Instance: 14000; Time: 47.10s; loss: 637078.3643; acc: 517582.0/650469.0=0.7957
     Instance: 16000; Time: 52.80s; loss: 651482.7441; acc: 590662.0/743960.0=0.7939
     Instance: 16628; Time: 19.69s; loss: 192460.6211; acc: 614139.0/773637.0=0.7938
Epoch: 3 training finished. Time: 453.64s, speed: 36.65st/s, total loss: 5032523.99805
totalloss: 5032523.99805
gold_num = 6540 pred_num = 0 right_num = 0
Dev: time: 40.17s, speed: 106.54st/s; acc: 0.8918, p: -1.0000, r: 0.0000, f: -1.0000
gold_num = 7455 pred_num = 0 right_num = 0
Test: time: 46.26s, speed: 100.89st/s; acc: 0.8866, p: -1.0000, r: 0.0000, f: -1.0000
Epoch: 4/50
Learning rate is setted as: 0.0125

jiesutd commented 6 years ago

Your loss exploded (loss: 1508746.1278). You can try setting ave_batch_loss=True first.
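
For context, ave_batch_loss controls whether the summed per-sentence losses in a batch are divided by the batch size before backpropagation; averaging keeps the gradient magnitude roughly independent of batch size, which can stop SGD with a fixed learning rate from diverging when the raw losses are this large. A minimal sketch of the idea (illustrative numbers, not the toolkit's code):

# batch_loss_sketch.py -- summed vs. averaged batch loss
sentence_losses = [94000.0, 87500.0, 101200.0]           # per-sentence CRF losses (illustrative)
loss_sum = sum(sentence_losses)                          # ave_batch_loss=False: scales with batch size
loss_avg = sum(sentence_losses) / len(sentence_losses)   # ave_batch_loss=True: stays bounded
print(loss_sum, loss_avg)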