Multiword Tokens

Hello,

We're attempting to reproduce some of the results from the CoNLL 2017 Shared Task and are having trouble with the parser's output: it doesn't handle multiword tokens the way the Shared Task seems to require.

An example in German would be:

raw text: "Zum"

Gold:
1-2 Zum
1 Zu
2 dem

Parser output:
1 zu
2 dem

The parser output is missing the multiword token line "1-2 Zum", and the evaluation script aborts because the number of tokens in the parser output and in the gold file now differ. Since the parser was scored in the Shared Task, I'm assuming there's a configuration option somewhere that enables multiword token output and is disabled by default, but I can't find it.
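In case no such option exists, here is a minimal sketch of a post-processing workaround we are considering. It is not part of Parser-v2, and the function names (`split_sentences`, `restore_mwt_lines`) are made up for illustration. It assumes the tokenized CoNLL-U file given to the parser still contains the multiword token range lines (IDs such as `1-2`), that the parser output has the same sentences in the same order, and that word IDs are unchanged. Under those assumptions it copies each range line back in front of its first word.

```python
def split_sentences(conllu_text):
    """Split CoNLL-U text into sentences, each a list of non-empty lines."""
    sentences, current = [], []
    for line in conllu_text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences


def restore_mwt_lines(tokenized_text, parsed_text):
    """Copy multiword token range lines from the tokenized input into the
    parser output. Hypothetical helper, not a Parser-v2 function."""
    tokenized = split_sentences(tokenized_text)
    parsed = split_sentences(parsed_text)
    if len(tokenized) != len(parsed):
        raise ValueError("sentence counts differ; cannot align the files")

    out = []
    for tok_sent, par_sent in zip(tokenized, parsed):
        # Map the first word ID of each range (e.g. "1" for "1-2") to its line.
        ranges = {}
        for line in tok_sent:
            if line.startswith("#"):
                continue
            word_id = line.split("\t")[0]
            if "-" in word_id:
                ranges[word_id.split("-")[0]] = line

        for line in par_sent:
            if not line.startswith("#"):
                word_id = line.split("\t")[0]
                if "-" in word_id:
                    # The output already has this range line; do not duplicate it.
                    ranges.pop(word_id.split("-")[0], None)
                elif word_id in ranges:
                    out.append(ranges.pop(word_id))
            out.append(line)
        out.append("")  # blank line terminates the sentence
    return "\n".join(out) + "\n"
```

For the "Zum" example above, this would put the `1-2 Zum` line back before `1 zu`, so the token counts seen by the evaluation script should match again. It would not help if the parser changes the word segmentation itself.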

Thank you for your help.

tdozat / Parser-v2

Multiword Tokens #16