Fused words in Universal Dependencies

I agree we need a better (less wild) API for fused (aka multi-word) tokens in Treex.

I am not sure how that would solve the problem in KonText, which can probably display either only tokens or only words, but not both. There are scripts distributed with UD (e.g. conllu-w2t.py) for converting the word-indexed CoNLL-U format to other formats.
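To make the token/word distinction concrete, here is a minimal Python sketch of reading both levels from one CoNLL-U sentence. It is not what conllu-w2t.py does and it is not a Treex API; the function name `split_levels` is hypothetical. It assumes the standard CoNLL-U conventions: tab-separated columns, a fused token on a range line such as `1-2`, and the words it covers on the following lines.

```python
def split_levels(conllu_sentence):
    """Return (tokens, words) for one CoNLL-U sentence given as a string.

    tokens: surface forms, with each fused token counted once
    words:  syntactic words, i.e. the units that carry the annotation
    """
    tokens, words = [], []
    covered_until = 0  # highest word ID covered by the last range line seen
    for line in conllu_sentence.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        node_id, form = cols[0], cols[1]
        if "-" in node_id:
            # Range line (e.g. "1-2"): one surface token spanning several words.
            covered_until = int(node_id.split("-")[1])
            tokens.append(form)
        elif "." in node_id:
            # Decimal IDs are neither tokens nor words here (assumption: skip them).
            continue
        else:
            words.append(form)
            if int(node_id) > covered_until:
                # This word is not part of a fused token, so it is also a token.
                tokens.append(form)
    return tokens, words


# Only the ID and FORM columns are filled in for this illustration.
example = "\n".join([
    "# text = vámonos al mar",
    "1-2\tvámonos",
    "1\tvamos",
    "2\tnos",
    "3-4\tal",
    "3\ta",
    "4\tel",
    "5\tmar",
])

tokens, words = split_levels(example)
print(tokens)  # ['vámonos', 'al', 'mar']
print(words)   # ['vamos', 'nos', 'a', 'el', 'mar']
```

A tool that can show only one level would have to pick one of these two lists, which is the limitation I suspect KonText has.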

See also:
http://universaldependencies.github.io/docs/cs/overview/tokenization.html
http://universaldependencies.github.io/docs/u/overview/tokenization.html
http://universaldependencies.github.io/docs/format.html#words-and-tokens

ufal / treex

Fused words in Universal Dependencies #17