Closed yuxiazff closed 4 years ago
Hi, training on only the Chinese dataset is enough.
I have trained on the corpus UD 2.6/UD_Chinese-GSDSimp, with Word2vec embeddings downloaded from https://github.com/Embedding/Chinese-Word-Vectors (trained on "People's Daily News"). My training params are as follows:
```json
"program": "supar/cmds/crf_dependency.py",
"args": [
    "train",
    "--batch-size=2000",
    "--device=0",
    "--feat=bert",
    "-p=exp/ptb.crf.dependency.chinese.simple.bert/model",
    "--bert=./bert-base-chinese",
    "--embed=data/sgns.renmin.word",
    "--train=data/ptb/zh_gsdsimp-ud-train.conllu",
    "--dev=data/ptb/zh_gsdsimp-ud-dev.conllu",
    "--test=data/ptb/zh_gsdsimp-ud-test.conllu"
]
```
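For reference, the file passed to `--embed` (from the Chinese-Word-Vectors release) appears to be in the standard word2vec text format: an optional header line with the vocabulary size and dimension, then one token per line followed by its vector. A minimal loader sketch, assuming that format (this is an illustration, not SuPar's own embedding reader):

```python
import io

def load_word2vec_text(f):
    """Parse a word2vec text-format embedding file into a dict.

    Assumed layout: an optional header line "vocab_size dim", then lines of
    "token v1 v2 ... vd". A two-field first line is treated as the header
    (good enough for this sketch; a 1-d embedding would be misdetected).
    """
    vectors = {}
    first = f.readline().split()
    if len(first) != 2:
        # No header: the first line is already a token and its vector.
        vectors[first[0]] = [float(x) for x in first[1:]]
    for line in f:
        parts = line.rstrip().split(' ')
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

# Tiny in-memory example standing in for data/sgns.renmin.word.
sample = io.StringIO("2 3\n你好 0.1 0.2 0.3\n世界 0.4 0.5 0.6\n")
emb = load_word2vec_text(sample)
```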
I have changed the params for different models, and the results of the models are as follows:
| model | 100epoch-UAS | 100epoch-LAS | 200epoch-UAS | 200epoch-LAS |
|---|---|---|---|---|
| tag+Biaffine Dependency | 86.25% | 83.70% | 86.89% | 84.45% |
| char+Biaffine Dependency | 81.83% | 76.60% | 82.60% | 77.71% |
| bert+Biaffine Dependency | 79.15% | 73.03% | 80.43% | 74.75% |
| char+CRFNP Dependency | 75.77% | 70.66% | | |
| tag+CRF Dependency | 71.57% | 69.86% | | |
| char+CRF Dependency | 65.17% | 61.32% | | |
| char+CRF2o Dependency | 52.06% | 48.94% | | |
The "tag+Biaffine Dependency" model has the best performance, and the other models score much lower. Why are the results of the other models so bad? Is there something wrong in my setup, or have I forgotten to do something?
For CRF models, please make sure non-projective trees are filtered out with `--proj`.
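For context, a dependency tree is projective iff no two of its arcs cross. A minimal check along those lines (an illustrative sketch, not SuPar's implementation; the head arrays below are made-up examples):

```python
def is_projective(heads):
    """Return True iff the dependency tree is projective.

    heads[i] is the 1-based head of token i+1, with 0 denoting the root.
    The tree is projective iff no two arcs cross, i.e. there is no pair
    of arcs whose spans strictly interleave.
    """
    # Represent each arc (dependent, head) by its span endpoints.
    arcs = [(min(d, h), max(d, h)) for d, h in enumerate(heads, 1)]
    for i, (l1, r1) in enumerate(arcs):
        for l2, r2 in arcs[i + 1:]:
            if l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1:
                return False
    return True
```

Sentences whose gold trees fail this check are the ones `--proj` would drop from training, so a corpus with many non-projective trees shrinks noticeably after filtering.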
I have downloaded the UD 2.3 dataset. If I want to get the results reported in the paper on the Chinese dataset of UD 2.3, should I train the model on the whole UD 2.3 dataset, or only on the Chinese dataset?