ufal / udpipe

UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files
Mozilla Public License 2.0

Does the “joint model” mean joint probability? #84

Open msklvsk opened 5 years ago

msklvsk commented 5 years ago

Does UDPipe-Future predict morphology first and then use that morphology as input to syntax, or does it consider the joint probability of a given syntactic analysis together with a given morphological analysis? There should be cases where a probable morphological analysis leads to an improbable syntactic one.

foxik commented 5 years ago

In the shared task we tried two approaches. One was to predict morphology first and then use it as input to syntax. The other was to compute shared representations for both morphology and syntax and then predict each independently. Generally, both approaches seem to result in very similar performance.

msklvsk commented 5 years ago

While investigating morphological errors of UDPipe 1.2 for Ukrainian, I found that ~20% are cases where the tagger selected the more popular interpretation instead of the rare but correct one. If it had considered how unlikely a parse such a popular interpretation would produce, it might have made the correct decision. But it must be different with UDPipe-Future. Looking forward to testing it!
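
To make the point concrete, here is a toy sketch of the difference between choosing a tag on morphology alone and choosing it by joint score. The probabilities and the names `tag_prob`, `parse_prob_given_tag`, `pipeline_choice` and `joint_choice` are made up for illustration; nothing here is taken from UDPipe or from any treebank.

```python
# Toy illustration with invented probabilities (not from UDPipe or real data).
# One ambiguous token with two candidate morphological interpretations.
tag_prob = {"popular": 0.7, "rare": 0.3}  # P(tag)

# P(best parse | tag): here the popular interpretation forces an unlikely parse.
parse_prob_given_tag = {"popular": 0.1, "rare": 0.9}


def pipeline_choice(tag_prob):
    """Pick the tag from morphology alone, as a tagger-then-parser pipeline would."""
    return max(tag_prob, key=tag_prob.get)


def joint_choice(tag_prob, parse_prob_given_tag):
    """Pick the tag that maximizes P(tag) * P(parse | tag)."""
    return max(tag_prob, key=lambda t: tag_prob[t] * parse_prob_given_tag[t])


print(pipeline_choice(tag_prob))                     # popular
print(joint_choice(tag_prob, parse_prob_given_tag))  # rare
```

With these numbers the joint score is 0.07 for the popular interpretation and 0.27 for the rare one, so the joint decision flips to the rare tag while the pipeline keeps the popular one.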