Closed parksjin01 closed 5 years ago
Hi! Thanks for your interest in Sequence Labeling Parsing.
First of all, you should use only the training and development sets during training, never the test set: the model should never be shown the test set or evaluated on it during training (I'm referring to the line dev_gold=nkor/test.conll). If you change it to point to the file with the dev set, it may give you the correct accuracy.
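As a quick sanity check, something like the following could flag a dev setting that points at a test file. This is only a sketch: it assumes the config consists of plain key=value lines with optional # comments, and read_config and split_warnings are hypothetical helper names, not part of the repository.

```python
def read_config(text):
    """Parse simple key=value lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def split_warnings(settings, dev_keys=("dev_gold",)):
    """Return a warning for each dev-related key whose file name contains 'test'."""
    warnings = []
    for key in dev_keys:
        path = settings.get(key, "")
        filename = path.rsplit("/", 1)[-1].lower()
        if "test" in filename:
            warnings.append(f"{key} points to '{path}', which looks like a test file")
    return warnings


# The line from the config discussed above triggers one warning.
print(split_warnings(read_config("dev_gold=nkor/test.conll")))
```

The check is purely name-based, so it only catches the obvious case where the file is literally called test-something.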
In line 486, all labels in the file you want to evaluate your model on are set to 0. The point of that is to ensure that the model is actually predicting those labels.
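To illustrate the idea (this is not the code from main.py), here is a minimal sketch: the gold labels are replaced by a dummy value before the file is given to the model, and the predictions are then scored against the untouched gold heads. The names PLACEHOLDER, mask_labels and uas are assumptions made for the example.

```python
PLACEHOLDER = "0"  # assumed dummy label; the reply only says labels are "set to 0"


def mask_labels(sentence):
    """Return a copy of a [(word, label), ...] sentence with every label hidden."""
    return [(word, PLACEHOLDER) for word, _ in sentence]


def uas(gold_heads, predicted_heads):
    """Unlabeled attachment score: fraction of tokens whose head is correct."""
    if len(gold_heads) != len(predicted_heads):
        raise ValueError("gold and predicted sequences differ in length")
    if not gold_heads:
        return 0.0
    correct = sum(g == p for g, p in zip(gold_heads, predicted_heads))
    return correct / len(gold_heads)


# Toy sentence with made-up labels: the model input keeps the words only.
sentence = [("She", "label_a"), ("reads", "label_b"), ("books", "label_c")]
print(mask_labels(sentence))

# Two of three heads are right in this toy prediction.
print(uas([2, 0, 2], [2, 0, 1]))
```

Because the labels in the input file are overwritten, a high score can only come from what the model predicts, not from labels leaking through the input.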
Hi, I'm a student studying NLP.
I'm trying to use the dep2label model for a dependency parsing task and I have some questions about the results.
First, the UAS on the dev dataset is too high. When using the Relative PoS-based encoding, UAS reaches about 99% after 200 iterations. I think that is too high, even for the dev dataset.
Second, after training the model, when I use the trained model to check accuracy on the test dataset, UAS is under 40%. I can't figure out why this happens.
Third, in main.py, the encoding is fixed to 0 on line 486, but the decoding is set to 3 on line 498. If the encoding and decoding methods are different, doesn't that affect performance?
Below are the config files for training and decoding.
train.config
decode.config
decode_best_model.config
File format
The 3 files above contain arc and label information.