ufal / udpipe

UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files
Mozilla Public License 2.0

Is the parser using computed tags during training? #17

Closed erikve closed 7 years ago

erikve commented 7 years ago

Thanks for a great tool!

I have a question regarding the source of PoS tags during parser training with UDPipe. Running udpipe --accuracy --tag --parse reports LAS/UAS for both computed and gold tags at test time, but it is not clear to me whether the parser uses computed or gold tags during training.

I had always assumed that the parser is trained on gold tags. Recently, however, I tried training the parser on data with computed tags (re-using the tagger from an existing model via the option --tagger=from_model=.. and also re-using the same pre-computed form embeddings), and the results were identical to those obtained when training on the original data with gold tags.

To shed more light on this, I then trained yet another parser on the original gold-tag data, specifying --tagger=none. Running udpipe --accuracy --parse with this model gives different LAS/UAS scores (as reported for 'Parsing from gold tokenization with gold tags') than when the UDPipe model also contains a tagger.

In sum, this seems to indicate that computed tags are used during parser training. Any clarification on this issue would be very welcome.

foxik commented 7 years ago

Sorry that the documentation is a bit lacking :-(

Yes, when a tagger is defined, the computed tags are used to train the parser. According to my measurements, this tends to yield a better parser: the computed tags are generated on the training data, so they are still heavily overfitted, but they nonetheless resulted in better parsing accuracy.

There is a parser option, use_gold_tags=1, which forces the use of gold POS tags. I will document it shortly (leaving this open until I do).
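For readers who script their training runs, here is a minimal sketch of how the two training modes could be selected from Python. The helper name build_train_command and the file names are hypothetical. The sketch assumes that parser options are passed as --parser=name=value, by analogy with the --tagger=... option mentioned above, and that the model file precedes the training files on the command line. Check both against the UDPipe manual before relying on them.

```python
from typing import List, Optional, Sequence


def build_train_command(
    model_path: str,
    training_files: Sequence[str],
    heldout_file: Optional[str] = None,
    use_gold_tags: bool = False,
) -> List[str]:
    """Assemble an argv list for a UDPipe training run (hypothetical helper)."""
    argv = ["udpipe", "--train"]
    if heldout_file is not None:
        # --heldout= is the option discussed in this thread.
        argv.append(f"--heldout={heldout_file}")
    if use_gold_tags:
        # Assumption: parser options use the form --parser=name=value.
        # Without it, the default (use_gold_tags=0) trains on computed tags.
        argv.append("--parser=use_gold_tags=1")
    argv.append(model_path)
    argv.extend(training_files)
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(build_train_command(
        "gold.udpipe", ["train.conllu"], heldout_file="dev.conllu", use_gold_tags=True)))
```

The returned list can be handed to subprocess.run once the command looks right. Leaving use_gold_tags at False adds no parser option, so the default behaviour described above (training on computed tags) applies.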

erikve commented 7 years ago

Thanks a lot for the quick response!

That's been my experience as well: training on computed tags gives better results. So I was trying to accomplish what UDPipe already does by default. Thanks for the clarification.

Btw; I assume the same holds for the argument of --heldout= when this is used for tuning the parser, i.e. this validation or test set too uses computed tags from the tagger component?

foxik commented 7 years ago

> Btw; I assume the same holds for the argument of --heldout= when this is used for tuning the parser, i.e. this validation or test set too uses computed tags from the tagger component?

Yes, the --heldout data is processed the same way as the training data -- if a tagger is being trained (and use_gold_tags is zero, which is the default), the heldout data is also POS tagged by the trained tagger before training the parser.
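The data flow described here can be sketched in a few lines of Python. This is a toy illustration, not UDPipe's actual implementation: the function tags_for_parser, the token dictionaries, and the tagger callable are all hypothetical names chosen for the example. The point is only that training and heldout data go through the same step, and that gold tags survive only when no tagger exists or use_gold_tags is set.

```python
from typing import Callable, Dict, List, Optional

Token = Dict[str, str]                      # e.g. {"form": "dogs", "upos": "NOUN"}
Sentence = List[Token]
Tagger = Callable[[List[str]], List[str]]   # word forms in, one POS tag per form out


def tags_for_parser(
    sentences: List[Sentence],
    tagger: Optional[Tagger],
    use_gold_tags: bool = False,
) -> List[Sentence]:
    """Return copies of the sentences carrying the tags the parser would see."""
    if tagger is None or use_gold_tags:
        # No tagger, or gold tags forced: keep the annotated tags unchanged.
        return [[dict(tok) for tok in sent] for sent in sentences]
    retagged = []
    for sent in sentences:
        predicted = tagger([tok["form"] for tok in sent])
        # Overwrite the gold tag with the tagger's prediction.
        retagged.append([{**tok, "upos": tag} for tok, tag in zip(sent, predicted)])
    return retagged
```

In this sketch one would call tags_for_parser once on the training sentences and once on the heldout sentences, with the same tagger and the same use_gold_tags value, before handing both to the parser trainer.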

As for the accuracy measurements (--accuracy without --train), all possible results are measured, i.e., if a POS tagger is used, parsing is evaluated with both gold POS tags and computed POS tags.
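As a reminder of what the reported numbers mean, here is a minimal sketch of the standard UAS/LAS definitions: UAS is the fraction of tokens with the correct head, and LAS is the fraction with both the correct head and the correct dependency label. The function attachment_scores and the (head, label) pair representation are assumptions made for this example. This is not UDPipe's evaluation code and ignores details such as how punctuation is treated.

```python
from typing import List, Tuple

Arc = Tuple[int, str]  # (head index, dependency label) for one token


def attachment_scores(gold: List[Arc], predicted: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) for two aligned, non-empty lists of arcs."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and of equal length")
    correct_heads = sum(1 for g, p in zip(gold, predicted) if g[0] == p[0])
    correct_arcs = sum(1 for g, p in zip(gold, predicted) if g == p)
    total = len(gold)
    return correct_heads / total, correct_arcs / total
```

Evaluating the same parser twice, once on input with gold tags and once on input with computed tags, and calling such a function on each output gives the two sets of scores that the accuracy run reports.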

foxik commented 7 years ago

Documented in UDPipe 1.1.