Open Oneplus opened 8 years ago
Accuracy
dataset | BILSTM |
---|---|
pku-weibo-train | 98.87% |
pku-weibo-holdout | 96.3366 % |
pku-weibo-test | 96.3295 % |
pku-holdout | 97.9482 % |
pku-test | 97.9871 % |
weibo-holdout | 93.1476 % |
weibo-test | 93.3387 % |
Speed
-
Accuracy:
dataset | pre_tag |
---|---|
pku-weibo-train | 98.4855% |
pku-weibo-holdout | 96.2750 % |
pku-weibo-test | 96.2261 % |
pku-holdout | 97.9798 % |
pku-test | 98.0088 % |
weibo-holdout | 92.9018 % |
weibo-test | 93.0096 % |
Speed:
14k/s (speed stays close to this value for every variant of this model, so it is not repeated below)
##
dataset | trans_number + pre_tag |
---|---|
pku-weibo-train | 98.4776% |
pku-weibo-holdout | 96.4691 % |
pku-weibo-test | 96.3994 % |
pku-holdout | 98.1031 % |
pku-test | 98.0442 % |
weibo-holdout | 93.4316 % |
weibo-test | 93.2359 % |

dataset | trans_number + pre_tag + UNK_replace |
---|---|
pku-weibo-train | 98.5275 % |
pku-weibo-holdout | 96.4633 % |
pku-weibo-test | 96.4428 % |
pku-holdout | 98.1119 % |
pku-test | 98.1346 % |
weibo-holdout | 93.2013 % |
weibo-test | 93.3903 % |
- Results updated; the earlier code had a bug.
gigawords and sogou-news embeddings are used as fixed word embeddings (pre-trained using word2vec); dimension: 50; vocabulary size: 300K and 1.1M.

dataset | Double Channel (gigawords-skipgram) | Double Channel (sogou-skipgram) |
---|---|---|
pku-weibo-train | 97.75% | 97.87% |
pku-weibo-holdout | 96.02% | 96.11% |
pku-weibo-test | 96.02% | 96.15% |
pku-holdout | 97.81% | 97.85% |
pku-test | 97.85% | 97.92% |
weibo-holdout | 92.49% | 92.68% |
weibo-test | 92.74% | 92.96% |

dataset | bi-LSTM CRF |
---|---|
pku-weibo-train | |
pku-weibo-holdout | 96.46% |
pku-weibo-test | 96.43% |
pku-holdout | 98.09% |
pku-test | 98.10% |
weibo-holdout | 93.22% |
weibo-test | 93.43% |
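Compared with pretag + unk_replace + trans_num (model 2, experiment 3), accuracy improves slightly on the WEIBO data (+0.02%, +0.03%), drops slightly on PKU (0.02%, 0.03%), and is marginally worse on the merged dataset (0, -0.01%).

About 4.4 K tokens/s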
Embedding F(x) (of the figure) dim: 50; fixed word embedding x (of the figure) dim: 50; bi-LSTM hidden dim: 100; CRF hidden dim: 32.

dataset | CRF-dc-giga-h32-skipgram | CRF-dc-sogou-h32-skipgram |
---|---|---|
pku-weibo-train | 97.78% | 97.67% |
pku-weibo-holdout | 96.11% | 96.12% |
pku-weibo-test | 96.11% | 96.07% |
pku-holdout | 97.82% | 97.77% |
pku-test | 97.87% | 97.74% |
weibo-holdout | 92.73% | 92.86% |
weibo-test | 92.93% | 93.07% |

with gigawords-embedding: about 4 K tokens/s
with sogou-embedding: about 3.4 K tokens/s
The handcrafted features mainly consist of prefix and suffix information of the word. They are copied from the LTP POSTAGGER.
Example :
word = "篮球"
handcraft features = [ "p1=篮", "p2=篮球", "p3=",
"s1=球", "s2=篮球", "s3=",
"2" ]
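
The example above can be reproduced with a few lines of Python. This is only a minimal sketch inferred from that single example, not the LTP implementation: the function name `affix_features`, the `max_len` parameter, and the reading of the final feature as the word length in characters are all assumptions.

```python
def affix_features(word, max_len=3):
    """Build prefix, suffix, and length features for one word.

    Assumption: an affix of length n is left empty when the word is
    shorter than n characters, as in the "篮球" example.
    """
    feats = []
    for n in range(1, max_len + 1):
        # Prefix of length n, or empty if the word is too short.
        prefix = word[:n] if len(word) >= n else ""
        feats.append(f"p{n}={prefix}")
    for n in range(1, max_len + 1):
        # Suffix of length n, or empty if the word is too short.
        suffix = word[-n:] if len(word) >= n else ""
        feats.append(f"s{n}={suffix}")
    # Assumption: the trailing feature is the word length in characters.
    feats.append(str(len(word)))
    return feats


print(affix_features("篮球"))
# ['p1=篮', 'p2=篮球', 'p3=', 's1=球', 's2=篮球', 's3=', '2']
```

Whether the original code caps the length feature for long words cannot be determined from the example, so the sketch leaves it uncapped.
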

Model | pku-weibo-holdout | pku-weibo-test | pku-holdout | pku-test | weibo-holdout | weibo-test | pku-weibo-train | epoch | Speed |
---|---|---|---|---|---|---|---|---|---|
F2I RNN | 95.42% | 95.49% | 97.22% | 97.24% | 91.87% | 92.33% | 96.99% | 15(15) | 13.5688 K |
F2I GRU | 95.96% | 96.01% | 97.63% | 97.72% | 92.65% | 92.92% | 97.45% | 10(15) | 12.8677 K |
F2I LSTM | 96.45% | 96.45% | 97.99% | 97.98% | 93.41% | 93.71% | 97.89% | 14(15) | 10.066 K |

Model | pku-weibo-holdout | pku-weibo-test | pku-holdout | pku-test | weibo-holdout | weibo-test | pku-weibo-train | epoch | Speed |
---|---|---|---|---|---|---|---|---|---|
F2O RNN | Failed due to a gradient error | - | - | - | - | - | - | - | - |
F2O GRU | 96.71% | 96.59% | 98.20% | 98.13% | 93.76% | 93.82% | 98.48% | 8(15) | 6.34398 K |
F2O LSTM | 96.80% | 96.80% | 98.27% | 98.27% | 93.90% | 94.14% | 98.64% | 9(15) | 10.3772 K |

PS: Speed may be meaningless because of different CPU load.
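
One way to make the speed column less sensitive to CPU load is to time several passes and report the median. The sketch below is only an illustration under stated assumptions: `tokens_per_second`, `tag_fn`, and `repeats` are hypothetical names, and the timing code actually used for these experiments is not shown in this issue.

```python
import statistics
import time


def tokens_per_second(tag_fn, sentences, repeats=5):
    """Median throughput of tag_fn over several timed passes.

    tag_fn is a hypothetical callable that tags one sentence
    (a list of tokens); sentences is a list of such lists.
    """
    n_tokens = sum(len(s) for s in sentences)
    rates = []
    for _ in range(repeats):
        start = time.perf_counter()
        for sent in sentences:
            tag_fn(sent)
        # Guard against a zero interval on very fast runs.
        elapsed = max(time.perf_counter() - start, 1e-9)
        rates.append(n_tokens / elapsed)
    # The median is less affected by a single slow pass than the mean.
    return statistics.median(rates)


# Usage with a dummy tagger that labels every token "n".
dummy_sentences = [["我", "喜欢", "篮球"]] * 1000
rate = tokens_per_second(lambda sent: ["n"] * len(sent), dummy_sentences)
print(f"{rate / 1000:.1f} K tokens/s")
```

Even with a median, numbers measured on a shared machine are only comparable when the runs see similar load.
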
the same as previous

Hidden layer sizes | pku-weibo-holdout | pku-weibo-test | pku-holdout | pku-test | weibo-holdout | weibo-test | pku-weibo-train | Speed | epoch-of-best |
---|---|---|---|---|---|---|---|---|---|
50 | 96.38% | 96.45% | 97.99% | 98.03% | 93.21% | 93.59% | 98.17% | 36.418 K | 13 |
100 | 96.44% | 96.41% | 98.03% | 98.02% | 93.29% | 93.52% | 98.14% | 23.0466 K | 9 |
200 | 96.46% | 96.50% | 98.02% | 98.05% | 93.37% | 93.71% | 98.43% | 15.6605 K | 14 |
300 | 96.41% | 96.40% | 97.97% | 97.96% | 93.33% | 93.60% | 98.35% | 9.93645 K | 12 |
100,32 | 96.24% | 96.19% | 97.84% | 97.87% | 93.06% | 93.16% | 97.85% | 31.3846 K | 14 |
100,100,32 | 96.12% | 96.11% | 97.76% | 97.75% | 92.88% | 93.16% | 97.70% | 25.6499 K | 14 |

Hidden layer sizes | pku-weibo-holdout | pku-weibo-test | pku-holdout | pku-test | weibo-holdout | weibo-test | pku-weibo-train | Speed | epoch-of-best |
---|---|---|---|---|---|---|---|---|---|
100 | 95.93% | 95.87% | 97.65% | 97.64% | 92.55% | 92.66% | 98.11% | 54.3706 K | 14 |
200 | 95.97% | 95.96% | 97.70% | 97.72% | 92.53% | 92.79% | 98.29% | 33.4226 K | 15 |
300 | 95.92% | 95.88% | 97.65% | 97.65% | 92.50% | 92.70% | 98.12% | 20.9799 K | 9 |
100,32 | 95.75% | 95.70% | 97.58% | 97.52% | 92.13% | 92.43% | 97.79% | 50.7936 K | 14 |
100,100,32 | 95.53% | 95.52% | 97.40% | 97.41% | 91.83% | 92.09% | 97.55% | 38.0326 K | 15 |

PS: Speed may be meaningless because of different CPU load.

This issue is used to record the experiment process.