Closed by GabrielLin 6 years ago
I think ',' is not seen in the training data (',' is used instead), so it is treated as an OOV token and tagged incorrectly. You can substitute every ',' with ',', retrain the model, and see if it works better.
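For reference, the substitution could be done with a small script like the one below. This is only a sketch: it assumes the two characters are the ASCII comma (U+002C) and the fullwidth comma (U+FF0C), and that the training file is plain UTF-8 text. The function names and the file names in the usage note are made up for illustration and are not part of tagger.py.

```python
# Sketch only: names and defaults here are hypothetical, not part of tagger.py.
ASCII_COMMA = "\u002C"      # halfwidth comma
FULLWIDTH_COMMA = "\uFF0C"  # fullwidth comma


def unify_commas(text, src=ASCII_COMMA, dst=FULLWIDTH_COMMA):
    """Return text with every occurrence of src replaced by dst."""
    return text.replace(src, dst)


def unify_file(in_path, out_path, src=ASCII_COMMA, dst=FULLWIDTH_COMMA):
    """Copy in_path to out_path line by line, replacing src with dst."""
    with open(in_path, encoding="utf-8") as fin, \
            open(out_path, "w", encoding="utf-8") as fout:
        for line in fin:
            fout.write(unify_commas(line, src, dst))
```

Something like unify_file("train.txt", "train_unified.txt") would then produce the file to retrain on. Swap src and dst if the replacement should go the other way, and check first that the comma is not also used inside tokens you want to keep unchanged.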
If I feed the network training data containing both ',' and ',', would that affect performance?
I don't think so. That actually might be a good idea!
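One way to get both variants into the training data is to add a copy of each sentence with the commas swapped. A rough sketch, assuming sentences are available as plain strings and that the two characters are the ASCII comma (U+002C) and the fullwidth comma (U+FF0C). The helper name is made up, and the real train.txt format may need different handling:

```python
# Sketch only: add_comma_variants is a hypothetical helper, not part of tagger.py.
ASCII_COMMA = "\u002C"      # halfwidth comma
FULLWIDTH_COMMA = "\uFF0C"  # fullwidth comma

# Translation table that swaps the two comma variants.
_SWAP = str.maketrans({ASCII_COMMA: FULLWIDTH_COMMA,
                       FULLWIDTH_COMMA: ASCII_COMMA})


def add_comma_variants(sentences):
    """Return the sentences plus a comma-swapped copy of each one that has a comma."""
    out = []
    for sentence in sentences:
        out.append(sentence)
        swapped = sentence.translate(_SWAP)
        if swapped != sentence:  # only duplicate sentences that contain a comma
            out.append(swapped)
    return out
```

This keeps every original sentence and only grows the data by the sentences that actually contain a comma, so the model sees both forms in the same contexts.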
I trained the model with
python tagger.py train -p ud1 -t train.txt -d dev.txt -wv -cp -rd -gru -m model_ud1 -emb Embeddings/glove.txt
and tested it with
The results are:
It seems that every ',' is tagged as NUM.