EricAugust closed this issue 5 years ago
This is because your model is not trained well. You can see the acc
value for reference. You can share your log file for more suggestions.
Seed num: 42 MODEL: train Training model... ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ DATA SUMMARY START: I/O: Tag scheme: BIO MAX SENTENCE LENGTH: 250 MAX WORD LENGTH: -1 Number normalized: True Word alphabet size: 93367 Char alphabet size: 5111 Label alphabet size: 17 Word embedding dir: None Char embedding dir: None Word embedding size: 300 Char embedding size: 300 Norm word emb: False Norm char emb: False Train file directory: data/example.train Dev file directory: data/example.dev Test file directory: data/example.test Raw file directory: None Dset file directory: None Model file directory: save/lstmcrf Loadmodel directory: None Decode file directory: None Train instance number: 48756 Dev instance number: 11601 Test instance number: 5980 Raw instance number: 0 FEATURE num: 0 ++++++++++++++++++++++++++++++++++++++++ Model Network: Model use_crf: True Model word extractor: LSTM Model use_char: False ++++++++++++++++++++++++++++++++++++++++ Training: Optimizer: SGD Iteration: 20 BatchSize: 128 Average batch loss: True ++++++++++++++++++++++++++++++++++++++++ Hyperparameters: Hyper lr: 0.015 Hyper lr_decay: 0.05 Hyper HP_clip: None Hyper momentum: 0.0 Hyper l2: 1e-08 Hyper hidden_dim: 200 Hyper dropout: 0.5 Hyper lstm_layer: 1 Hyper bilstm: True Hyper GPU: True DATA SUMMARY END. ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ build network... use_char: False word feature extractor: LSTM use crf: True build word sequence feature extractor: LSTM... build word representation... build CRF... 
Epoch: 0/20 Learning rate is set as: 0.015 Instance: 16000; Time: 16.67s; loss: 1257.2989; acc: 361901.0/388648.0=0.9312 Instance: 32000; Time: 16.45s; loss: 1017.1760; acc: 728406.0/779934.0=0.9339 Instance: 48000; Time: 16.75s; loss: 976.8528; acc: 1092761.0/1168973.0=0.9348 Instance: 48756; Time: 0.83s; loss: 49.5680; acc: 1110536.0/1188013.0=0.9348 Epoch: 0 training finished. Time: 50.71s, speed: 961.49st/s, total loss: 3300.8956022262573 totalloss: 3300.8956022262573 gold_num = 6062 pred_num = 0 right_num = 0 Dev: time: 12.06s, speed: 972.83st/s; acc: 0.9253, p: -1.0000, r: 0.0000, f: -1.0000 Exceed previous best f score: -10 Save current best model in file: save/lstmcrf.0.model gold_num = 0 pred_num = 0 right_num = 0 Test: time: 9.31s, speed: 655.88st/s; acc: 0.8924, p: -1.0000, r: -1.0000, f: -1.0000 Epoch: 1/20 Learning rate is set as: 0.014285714285714285 Instance: 16000; Time: 17.53s; loss: 958.0479; acc: 364878.0/389743.0=0.9362 Instance: 32000; Time: 17.92s; loss: 940.4033; acc: 730714.0/780416.0=0.9363 Instance: 48000; Time: 17.93s; loss: 931.0096; acc: 1094975.0/1169534.0=0.9362 Instance: 48756; Time: 0.90s; loss: 42.4152; acc: 1112360.0/1188013.0=0.9363 Epoch: 1 training finished. Time: 54.28s, speed: 898.28st/s, total loss: 2871.875850200653 totalloss: 2871.875850200653 gold_num = 6062 pred_num = 0 right_num = 0 Dev: time: 12.39s, speed: 947.35st/s; acc: 0.9253, p: -1.0000, r: 0.0000, f: -1.0000 gold_num = 0 pred_num = 0 right_num = 0 Test: time: 9.54s, speed: 631.60st/s; acc: 0.8924, p: -1.0000, r: -1.0000, f: -1.0000 Epoch: 2/20 Learning rate is set as: 0.013636363636363634 Instance: 16000; Time: 17.90s; loss: 916.0634; acc: 364379.0/389213.0=0.9362 Instance: 32000; Time: 16.73s; loss: 896.1602; acc: 730395.0/779832.0=0.9366 Instance: 48000; Time: 17.18s; loss: 900.7790; acc: 1095124.0/1169608.0=0.9363 Instance: 48756; Time: 0.81s; loss: 42.6174; acc: 1112360.0/1188013.0=0.9363 Epoch: 2 training finished. 
Time: 52.62s, speed: 926.63st/s, total loss: 2755.6200108528137 totalloss: 2755.6200108528137 gold_num = 6062 pred_num = 0 right_num = 0 Dev: time: 12.16s, speed: 964.57st/s; acc: 0.9253, p: -1.0000, r: 0.0000, f: -1.0000 gold_num = 0 pred_num = 0 right_num = 0 Test: time: 9.27s, speed: 650.77st/s; acc: 0.8924, p: -1.0000, r: -1.0000, f: -1.0000 Epoch: 3/20 Learning rate is set as: 0.013043478260869566 Instance: 16000; Time: 16.73s; loss: 863.2110; acc: 363875.0/388137.0=0.9375 Instance: 32000; Time: 16.79s; loss: 870.7078; acc: 727458.0/776722.0=0.9366 Instance: 48000; Time: 17.74s; loss: 864.7939; acc: 1094755.0/1169257.0=0.9363 Instance: 48756; Time: 0.83s; loss: 40.6792; acc: 1112360.0/1188013.0=0.9363 Epoch: 3 training finished. Time: 52.09s, speed: 936.05st/s, total loss: 2639.39190864563 totalloss: 2639.39190864563 gold_num = 6062 pred_num = 0 right_num = 0 Dev: time: 12.30s, speed: 954.03st/s; acc: 0.9253, p: -1.0000, r: 0.0000, f: -1.0000 gold_num = 0 pred_num = 0 right_num = 0 Test: time: 9.24s, speed: 652.88st/s; acc: 0.8924, p: -1.0000, r: -1.0000, f: -1.0000 Epoch: 4/20 Learning rate is set as: 0.0125 Instance: 16000; Time: 17.91s; loss: 848.1224; acc: 364788.0/389921.0=0.9355 Instance: 32000; Time: 17.35s; loss: 835.4852; acc: 729310.0/779431.0=0.9357 Instance: 48000; Time: 17.29s; loss: 811.4224; acc: 1094940.0/1169539.0=0.9362 Instance: 48756; Time: 0.81s; loss: 36.5612; acc: 1112356.0/1188013.0=0.9363 Epoch: 4 training finished. 
Time: 53.36s, speed: 913.73st/s, total loss: 2531.5912747383118 totalloss: 2531.5912747383118 gold_num = 6062 pred_num = 2 right_num = 0 Dev: time: 12.28s, speed: 955.89st/s; acc: 0.9253, p: 0.0000, r: 0.0000, f: -1.0000 gold_num = 0 pred_num = 0 right_num = 0 Test: time: 9.38s, speed: 642.48st/s; acc: 0.8924, p: -1.0000, r: -1.0000, f: -1.0000 Epoch: 5/20 Learning rate is set as: 0.012 Instance: 16000; Time: 17.02s; loss: 798.1776; acc: 364159.0/388505.0=0.9373 Instance: 32000; Time: 17.21s; loss: 794.4603; acc: 729510.0/778414.0=0.9372 Instance: 48000; Time: 17.47s; loss: 803.5468; acc: 1095145.0/1169338.0=0.9366 Instance: 48756; Time: 0.87s; loss: 35.6094; acc: 1112702.0/1188013.0=0.9366 Epoch: 5 training finished. Time: 52.57s, speed: 927.49st/s, total loss: 2431.794246196747 totalloss: 2431.794246196747 gold_num = 6062 pred_num = 551 right_num = 204 Dev: time: 12.23s, speed: 959.50st/s; acc: 0.9256, p: 0.3702, r: 0.0337, f: 0.0617 Exceed previous best f score: -1 Save current best model in file: save/lstmcrf.5.model gold_num = 0 pred_num = 255 right_num = 0 Test: time: 9.41s, speed: 649.28st/s; acc: 0.8921, p: 0.0000, r: -1.0000, f: -1.0000 Epoch: 6/20 Learning rate is set as: 0.011538461538461537 Instance: 16000; Time: 17.56s; loss: 775.2189; acc: 365355.0/389864.0=0.9371 Instance: 32000; Time: 17.50s; loss: 767.5265; acc: 731120.0/780142.0=0.9372 Instance: 48000; Time: 17.55s; loss: 750.8440; acc: 1096357.0/1169444.0=0.9375 Instance: 48756; Time: 0.87s; loss: 36.9284; acc: 1113772.0/1188013.0=0.9375 Epoch: 6 training finished. 
Time: 53.48s, speed: 911.75st/s, total loss: 2330.51779794693 totalloss: 2330.51779794693 gold_num = 6062 pred_num = 1264 right_num = 495 Dev: time: 12.15s, speed: 966.10st/s; acc: 0.9262, p: 0.3916, r: 0.0817, f: 0.1351 Exceed previous best f score: 0.06169665809768638 Save current best model in file: save/lstmcrf.6.model gold_num = 0 pred_num = 660 right_num = 0 Test: time: 9.36s, speed: 652.44st/s; acc: 0.8920, p: 0.0000, r: -1.0000, f: -1.0000 Epoch: 7/20 Learning rate is set as: 0.01111111111111111 Instance: 16000; Time: 17.47s; loss: 739.2744; acc: 367219.0/391165.0=0.9388 Instance: 32000; Time: 17.54s; loss: 736.4028; acc: 732631.0/780651.0=0.9385 Instance: 48000; Time: 17.62s; loss: 719.3582; acc: 1097871.0/1169616.0=0.9387 Instance: 48756; Time: 0.94s; loss: 31.8277; acc: 1115246.0/1188013.0=0.9387 Epoch: 7 training finished. Time: 53.57s, speed: 910.11st/s, total loss: 2226.863118171692 totalloss: 2226.863118171692 gold_num = 6062 pred_num = 1637 right_num = 677 Dev: time: 12.16s, speed: 965.39st/s; acc: 0.9267, p: 0.4136, r: 0.1117, f: 0.1759 Exceed previous best f score: 0.13513513513513514 Save current best model in file: save/lstmcrf.7.model gold_num = 0 pred_num = 809 right_num = 0 Test: time: 9.34s, speed: 652.97st/s; acc: 0.8920, p: 0.0000, r: -1.0000, f: -1.0000 Epoch: 8/20 Learning rate is set as: 0.010714285714285714 Instance: 16000; Time: 17.67s; loss: 715.8281; acc: 365666.0/389718.0=0.9383 Instance: 32000; Time: 17.73s; loss: 691.6875; acc: 731195.0/778518.0=0.9392 Instance: 48000; Time: 17.42s; loss: 685.3437; acc: 1098911.0/1169328.0=0.9398 Instance: 48756; Time: 0.85s; loss: 33.1987; acc: 1116500.0/1188013.0=0.9398 Epoch: 8 training finished. 
Time: 53.67s, speed: 908.49st/s, total loss: 2126.058032512665 totalloss: 2126.058032512665 gold_num = 6062 pred_num = 2031 right_num = 880 Dev: time: 12.18s, speed: 963.18st/s; acc: 0.9272, p: 0.4333, r: 0.1452, f: 0.2175 Exceed previous best f score: 0.17586699571372905 Save current best model in file: save/lstmcrf.8.model gold_num = 0 pred_num = 1003 right_num = 0 Test: time: 9.34s, speed: 652.51st/s; acc: 0.8913, p: 0.0000, r: -1.0000, f: -1.0000 Epoch: 9/20 Learning rate is set as: 0.010344827586206896 Instance: 16000; Time: 17.71s; loss: 672.8535; acc: 366316.0/389287.0=0.9410 Instance: 32000; Time: 17.72s; loss: 676.8826; acc: 734355.0/780665.0=0.9407 Instance: 48000; Time: 17.51s; loss: 662.7155; acc: 1101089.0/1170346.0=0.9408 Instance: 48756; Time: 0.83s; loss: 30.8140; acc: 1117695.0/1188013.0=0.9408 Epoch: 9 training finished. Time: 53.77s, speed: 906.74st/s, total loss: 2043.2654948234558 totalloss: 2043.2654948234558 gold_num = 6062 pred_num = 2227 right_num = 961 Dev: time: 12.21s, speed: 960.88st/s; acc: 0.9276, p: 0.4315, r: 0.1585, f: 0.2319 Exceed previous best f score: 0.2174718892870382 Save current best model in file: save/lstmcrf.9.model gold_num = 0 pred_num = 1078 right_num = 0 Test: time: 9.37s, speed: 650.63st/s; acc: 0.8915, p: 0.0000, r: -1.0000, f: -1.0000 Epoch: 10/20 Learning rate is set as: 0.01 Instance: 16000; Time: 17.60s; loss: 644.4109; acc: 366570.0/389148.0=0.9420 Instance: 32000; Time: 19.28s; loss: 654.0172; acc: 736675.0/782283.0=0.9417 Instance: 48000; Time: 17.72s; loss: 640.6378; acc: 1101869.0/1169984.0=0.9418 Instance: 48756; Time: 0.87s; loss: 29.2982; acc: 1118885.0/1188013.0=0.9418 Epoch: 10 training finished. 
Time: 55.46s, speed: 879.08st/s, total loss: 1968.3641047477722 totalloss: 1968.3641047477722 gold_num = 6062 pred_num = 2485 right_num = 1054 Dev: time: 12.26s, speed: 957.00st/s; acc: 0.9279, p: 0.4241, r: 0.1739, f: 0.2466 Exceed previous best f score: 0.23187356737845335 Save current best model in file: save/lstmcrf.10.model gold_num = 0 pred_num = 1246 right_num = 0 Test: time: 9.45s, speed: 645.41st/s; acc: 0.8915, p: 0.0000, r: -1.0000, f: -1.0000 Epoch: 11/20 Learning rate is set as: 0.009677419354838708 Instance: 16000; Time: 17.68s; loss: 631.4240; acc: 366747.0/389342.0=0.9420 Instance: 32000; Time: 17.83s; loss: 624.9867; acc: 735411.0/780235.0=0.9426 Instance: 48000; Time: 17.72s; loss: 618.2153; acc: 1102716.0/1169568.0=0.9428 Instance: 48756; Time: 0.86s; loss: 33.1386; acc: 1119982.0/1188013.0=0.9427 Epoch: 11 training finished. Time: 54.10s, speed: 901.30st/s, total loss: 1907.7646684646606 totalloss: 1907.7646684646606 gold_num = 6062 pred_num = 2754 right_num = 1130 Dev: time: 12.16s, speed: 964.89st/s; acc: 0.9280, p: 0.4103, r: 0.1864, f: 0.2564 Exceed previous best f score: 0.24663624663624664 Save current best model in file: save/lstmcrf.11.model gold_num = 0 pred_num = 1378 right_num = 0 Test: time: 9.36s, speed: 651.84st/s; acc: 0.8912, p: 0.0000, r: -1.0000, f: -1.0000 Epoch: 12/20 Learning rate is set as: 0.009375 Instance: 16000; Time: 17.53s; loss: 608.2854; acc: 366014.0/387880.0=0.9436 Instance: 32000; Time: 17.81s; loss: 610.7275; acc: 734603.0/778519.0=0.9436 Instance: 48000; Time: 17.51s; loss: 606.5329; acc: 1103944.0/1169848.0=0.9437 Instance: 48756; Time: 0.99s; loss: 26.4471; acc: 1121184.0/1188013.0=0.9437 Epoch: 12 training finished. 
Time: 53.84s, speed: 905.54st/s, total loss: 1851.9929056167603 totalloss: 1851.9929056167603 gold_num = 6062 pred_num = 3260 right_num = 1237 Dev: time: 12.41s, speed: 945.60st/s; acc: 0.9279, p: 0.3794, r: 0.2041, f: 0.2654 Exceed previous best f score: 0.25635208711433755 Save current best model in file: save/lstmcrf.12.model gold_num = 0 pred_num = 1662 right_num = 0 Test: time: 9.43s, speed: 646.05st/s; acc: 0.8904, p: 0.0000, r: -1.0000, f: -1.0000 Epoch: 13/20 Learning rate is set as: 0.00909090909090909 Instance: 16000; Time: 26.48s; loss: 606.1129; acc: 367862.0/389940.0=0.9434 Instance: 32000; Time: 27.47s; loss: 587.4861; acc: 736105.0/779644.0=0.9442 Instance: 48000; Time: 27.00s; loss: 583.5399; acc: 1104429.0/1169428.0=0.9444 Instance: 48756; Time: 1.35s; loss: 30.0126; acc: 1121890.0/1188013.0=0.9443 Epoch: 13 training finished. Time: 82.30s, speed: 592.42st/s, total loss: 1807.1514234542847 totalloss: 1807.1514234542847 gold_num = 6062 pred_num = 3265 right_num = 1252 Dev: time: 12.49s, speed: 939.32st/s; acc: 0.9281, p: 0.3835, r: 0.2065, f: 0.2685 Exceed previous best f score: 0.2653936923406994 Save current best model in file: save/lstmcrf.13.model gold_num = 0 pred_num = 1683 right_num = 0 Test: time: 9.53s, speed: 639.40st/s; acc: 0.8905, p: 0.0000, r: -1.0000, f: -1.0000 Epoch: 14/20 Learning rate is set as: 0.008823529411764704 Instance: 16000; Time: 26.09s; loss: 583.7969; acc: 367289.0/388783.0=0.9447 Instance: 32000; Time: 26.72s; loss: 577.0665; acc: 734686.0/777559.0=0.9449 Instance: 48000; Time: 27.46s; loss: 578.3528; acc: 1105729.0/1170053.0=0.9450 Instance: 48756; Time: 1.36s; loss: 25.0701; acc: 1122771.0/1188013.0=0.9451 Epoch: 14 training finished. 
Time: 81.63s, speed: 597.31st/s, total loss: 1764.2863030433655 totalloss: 1764.2863030433655 gold_num = 6062 pred_num = 3464 right_num = 1279 Dev: time: 12.74s, speed: 921.69st/s; acc: 0.9282, p: 0.3692, r: 0.2110, f: 0.2685 Exceed previous best f score: 0.2684678889246274 Save current best model in file: save/lstmcrf.14.model gold_num = 0 pred_num = 1802 right_num = 0 Test: time: 9.64s, speed: 632.44st/s; acc: 0.8906, p: 0.0000, r: -1.0000, f: -1.0000 Epoch: 15/20 Learning rate is set as: 0.008571428571428572 Instance: 16000; Time: 27.54s; loss: 573.8195; acc: 369430.0/390709.0=0.9455 Instance: 32000; Time: 18.82s; loss: 566.2060; acc: 739649.0/782191.0=0.9456 Instance: 48000; Time: 17.63s; loss: 565.3736; acc: 1106409.0/1170083.0=0.9456 Instance: 48756; Time: 0.85s; loss: 25.1248; acc: 1123386.0/1188013.0=0.9456 Epoch: 15 training finished. Time: 64.84s, speed: 751.89st/s, total loss: 1730.52388048172 totalloss: 1730.52388048172 gold_num = 6062 pred_num = 3832 right_num = 1343 Dev: time: 12.25s, speed: 957.84st/s; acc: 0.9277, p: 0.3505, r: 0.2215, f: 0.2715 Exceed previous best f score: 0.26852823850514385 Save current best model in file: save/lstmcrf.15.model gold_num = 0 pred_num = 1967 right_num = 0 Test: time: 9.38s, speed: 649.95st/s; acc: 0.8892, p: 0.0000, r: -1.0000, f: -1.0000 Epoch: 16/20 Learning rate is set as: 0.008333333333333333 Instance: 16000; Time: 17.56s; loss: 560.1864; acc: 368128.0/389236.0=0.9458 Instance: 32000; Time: 17.68s; loss: 557.8418; acc: 736706.0/778948.0=0.9458 Instance: 48000; Time: 18.00s; loss: 551.7599; acc: 1106455.0/1169463.0=0.9461 Instance: 48756; Time: 0.86s; loss: 28.0625; acc: 1123954.0/1188013.0=0.9461 Epoch: 16 training finished. 
Time: 54.10s, speed: 901.26st/s, total loss: 1697.8506217002869 totalloss: 1697.8506217002869 gold_num = 6062 pred_num = 3993 right_num = 1374 Dev: time: 12.23s, speed: 960.15st/s; acc: 0.9280, p: 0.3441, r: 0.2267, f: 0.2733 Exceed previous best f score: 0.2714776632302406 Save current best model in file: save/lstmcrf.16.model gold_num = 0 pred_num = 2042 right_num = 0 Test: time: 9.40s, speed: 649.94st/s; acc: 0.8894, p: 0.0000, r: -1.0000, f: -1.0000 Epoch: 17/20 Learning rate is set as: 0.008108108108108107 Instance: 16000; Time: 17.68s; loss: 542.4215; acc: 367827.0/388569.0=0.9466 Instance: 32000; Time: 17.68s; loss: 557.6564; acc: 736645.0/778488.0=0.9463 Instance: 48000; Time: 17.48s; loss: 543.5346; acc: 1106416.0/1169079.0=0.9464 Instance: 48756; Time: 0.87s; loss: 27.1097; acc: 1124370.0/1188013.0=0.9464 Epoch: 17 training finished. Time: 53.71s, speed: 907.72st/s, total loss: 1670.7221012115479 totalloss: 1670.7221012115479 gold_num = 6062 pred_num = 3697 right_num = 1338 Dev: time: 12.21s, speed: 961.22st/s; acc: 0.9284, p: 0.3619, r: 0.2207, f: 0.2742 Exceed previous best f score: 0.2732968672302337 Save current best model in file: save/lstmcrf.17.model gold_num = 0 pred_num = 1910 right_num = 0 Test: time: 9.38s, speed: 650.41st/s; acc: 0.8905, p: 0.0000, r: -1.0000, f: -1.0000 Epoch: 18/20 Learning rate is set as: 0.007894736842105263 Instance: 16000; Time: 17.54s; loss: 542.7670; acc: 368490.0/389513.0=0.9460 Instance: 32000; Time: 17.65s; loss: 539.7525; acc: 737404.0/779176.0=0.9464 Instance: 48000; Time: 17.84s; loss: 534.3007; acc: 1107366.0/1169616.0=0.9468 Instance: 48756; Time: 0.87s; loss: 26.7099; acc: 1124750.0/1188013.0=0.9467 Epoch: 18 training finished. 
Time: 53.90s, speed: 904.61st/s, total loss: 1643.530002117157 totalloss: 1643.530002117157 gold_num = 6062 pred_num = 4045 right_num = 1402 Dev: time: 12.17s, speed: 964.46st/s; acc: 0.9282, p: 0.3466, r: 0.2313, f: 0.2774 Exceed previous best f score: 0.27420842299415926 Save current best model in file: save/lstmcrf.18.model gold_num = 0 pred_num = 2087 right_num = 0 Test: time: 9.34s, speed: 653.48st/s; acc: 0.8897, p: 0.0000, r: -1.0000, f: -1.0000 Epoch: 19/20 Learning rate is set as: 0.007692307692307691 Instance: 16000; Time: 17.58s; loss: 536.2504; acc: 370294.0/390973.0=0.9471 Instance: 32000; Time: 17.64s; loss: 533.7698; acc: 740258.0/781634.0=0.9471 Instance: 48000; Time: 17.48s; loss: 524.8374; acc: 1107478.0/1169427.0=0.9470 Instance: 48756; Time: 0.88s; loss: 27.1169; acc: 1125028.0/1188013.0=0.9470 Epoch: 19 training finished. Time: 53.58s, speed: 909.92st/s, total loss: 1621.9745783805847 totalloss: 1621.9745783805847 gold_num = 6062 pred_num = 4151 right_num = 1432 Dev: time: 12.22s, speed: 960.65st/s; acc: 0.9284, p: 0.3450, r: 0.2362, f: 0.2804 Exceed previous best f score: 0.27743148313050364 Save current best model in file: save/lstmcrf.19.model gold_num = 0 pred_num = 2141 right_num = 0 Test: time: 9.38s, speed: 650.06st/s; acc: 0.8899, p: 0.0000, r: -1.0000, f: -1.0000
`gold_num = 0` on the test set means no gold entities were found in the test data. Please check your test data.
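For reference, here is a minimal sketch of how the `p`, `r`, and `f` values in the log appear to follow from `gold_num`, `pred_num`, and `right_num`. The function name `prf` and the exact rules for the `-1` sentinel are assumptions inferred from the logged numbers, not copied from the NCRF++ source.

```python
def prf(gold_num, pred_num, right_num):
    """Precision, recall and F1 from entity counts.

    Assumption: -1 marks an undefined value (zero denominator),
    which is what the log above suggests.
    """
    p = right_num / pred_num if pred_num > 0 else -1.0
    r = right_num / gold_num if gold_num > 0 else -1.0
    if p == -1.0 or r == -1.0 or p + r <= 0:
        f = -1.0
    else:
        f = 2 * p * r / (p + r)
    return p, r, f


# Dev set after epoch 5 in the log: gold 6062, pred 551, right 204
print([round(v, 4) for v in prf(6062, 551, 204)])
# Test set after epoch 5: no gold entities at all, so r and f stay undefined
print(prf(0, 255, 0))
```

With `gold_num = 0`, recall is undefined no matter what the model predicts, so `r` and `f` stay at -1 on the test set for every epoch.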
Also, your learning curve looks slow; you can try more iterations.
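As a rough illustration, the learning rates printed in the log are consistent with an inverse-time decay, `lr / (1 + lr_decay * epoch)`. The sketch below is inferred from the logged values only; the helper name `decayed_lr` is hypothetical, and the actual schedule in the code may be written differently.

```python
def decayed_lr(base_lr, lr_decay, epoch):
    """Inverse-time decay, inferred from the 'Learning rate is set as' lines."""
    return base_lr / (1 + lr_decay * epoch)


# learning_rate=0.015 and lr_decay=0.05 are the values from the log above
for epoch in (0, 1, 10, 19):
    print(epoch, decayed_lr(0.015, 0.05, epoch))
```

With these settings the rate has only dropped to about half of its starting value by epoch 19, so running more iterations still leaves a usable step size.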
The test data uses the same labels. I don't know why the model counts them as 0.
Generally, your label set should include labels of the form `B-X`; otherwise, no entity will be recognized.
Yes, I use the BIO format. I can send you my data.
Sure, please send me the data so I can reproduce the problem.
@jiesutd Please check your email.
@EricAugust Your test data is in an incorrect format. As I said before, your label set should include labels of the form `B-X`, but your test data uses `B_X`. Fixing this format will solve the problem.
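A quick way to catch this kind of problem before training is to scan the data file for labels that are not `O`, `B-X`, or `I-X`. This is a minimal sketch, assuming one token per line with the label in the last whitespace-separated column and blank lines between sentences; `label_report` and `fix_label` are hypothetical helper names, not part of NCRF++.

```python
import re
from collections import Counter

# A BIO label is either "O" or "B-"/"I-" followed by an entity type.
VALID_LABEL = re.compile(r"^(O|[BI]-\S+)$")


def label_report(lines):
    """Count labels and collect the ones that do not look like BIO labels."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if not parts:  # blank line marks a sentence boundary
            continue
        counts[parts[-1]] += 1
    bad = sorted(label for label in counts if not VALID_LABEL.match(label))
    return counts, bad


def fix_label(label):
    """Turn B_X / I_X into B-X / I-X and leave every other label unchanged."""
    return re.sub(r"^([BI])_", r"\1-", label)


sample = ["小张 B_PER", "的 O", "母亲 O", "", "燃气灶 O"]
counts, bad = label_report(sample)
print(bad)
print([fix_label(label) for label in bad])
```

To check a real file, pass an open file handle instead of the sample list, for example `label_report(open("data/example.test", encoding="utf-8"))`.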
@jiesutd Thank you very much.
@jiesutd Hi, I changed the tag labels and tried to train an NER model, but I can't get a good one. With cnn-bilstm-crf and batch_size 16, I got an F1 score of 0.8039. With bert-crf and batch_size 16, I got an F1 score of 0.798. I collected all the Chinese labeled data I could, so it may come from different domains. Still, I don't know why these models can't reach a higher F1 score.
NER performance depends on the data and domain. It's hard to say whether an F score is high or low unless there is a baseline.
@jiesutd Is there any parameter which I can change?
You can change any parameters you want to tune your model.
Here are the main parameters: https://github.com/jiesutd/NCRFpp/blob/master/readme/Configuration.md
My config is:

```
### I/O ###
train_dir=data/example.train
dev_dir=data/example.dev
test_dir=data/texample.test
model_dir=save/lstmcrf
word_emb_dir=data/vocab_column_300d_w2v.bin.zip
raw_dir=
decode_dir=
dset_dir=
load_model_dir=
char_emb_dir=
norm_word_emb=False
norm_char_emb=False
number_normalized=True
seg=True
word_emb_dim=300
char_emb_dim=300

### NetworkConfiguration ###
use_crf=True
use_char=True
word_seq_feature=LSTM
char_seq_feature=CNN
feature=[POS] emb_size=20
feature=[Cap] emb_size=20
nbest=1

### TrainingSetting ###
status=train
optimizer=SGD
iteration=20
batch_size=128
ave_batch_loss=True

### Hyperparameters ###
cnn_layer=4
char_hidden_dim=200
hidden_dim=200
dropout=0.5
lstm_layer=1
bilstm=True
learning_rate=0.015
lr_decay=0.05
momentum=0
l2=1e-8
gpu=True
clip=
```

And my sample data is:

```
在 O
此人 O
的 O
一再 O
推荐 O
下 O
, O
小张 B_PER
的 O
母亲 O
换 O
了 O
一个 O
一千多元 O
的 O
燃气灶 O
。 O

不过 O
, O
等 O
小张 B_PER
回家 O
之后 O
, O
上网 O
搜索 O
才 O
知道 O
换 O
的 O
这个 O
燃气灶 O
并 O
不值钱 O
。 O
```

I changed to `use_char=False`, but p, r, and f are still -1.