Open reynoldsnlp opened 5 years ago
As for the CoNLL-U format, there does not appear to be any way to represent ambiguity, so the conversion would be lossy.
`mystem` can have ambiguous readings separated by `|` in its output, even with the `-d` (disambiguate) flag:
```
$ echo "Мы уже работаем здесь три недели." | mystem3.1 -ind
Мы{мы=SPRO,мн,1-л=им}
уже{уже=ADV=}
работаем{работать=V,несов,нп=непрош,мн,изъяв,1-л}
здесь{здесь=ADVPRO=}
три{три=NUM=им|три=NUM=вин,неод}
недели{неделя=S,жен,неод=вин,мн|неделя=S,жен,неод=род,ед|неделя=S,жен,неод=им,мн}
```
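To make the ambiguity concrete, here is a minimal Python sketch that splits each output line into its competing readings. It assumes only the line shape shown above (`form{lemma=lexical=inflectional|...}`); the names `Reading`, `TOKEN_RE`, and `parse_line` are hypothetical and not part of mystem or any existing converter, and other mystem output modes may not fit this pattern.

```python
import re
from typing import List, NamedTuple, Tuple


class Reading(NamedTuple):
    lemma: str
    lexical: str       # tags before the second "=", e.g. "S,жен,неод"
    inflectional: str  # tags after the second "=", e.g. "вин,мн" (may be empty)


# Assumed line shape, taken from the example above:
#   form{lemma=lexical=inflectional|lemma=lexical=inflectional|...}
TOKEN_RE = re.compile(r"^(?P<form>[^{}]+)\{(?P<readings>[^{}]*)\}$")


def parse_line(line: str) -> Tuple[str, List[Reading]]:
    """Split one line of output into the surface form and its readings."""
    match = TOKEN_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    readings = []
    for chunk in match.group("readings").split("|"):
        parts = chunk.split("=", 2)
        parts += [""] * (3 - len(parts))  # pad if a field is missing
        readings.append(Reading(*parts))
    return match.group("form"), readings


SAMPLE = """\
Мы{мы=SPRO,мн,1-л=им}
уже{уже=ADV=}
работаем{работать=V,несов,нп=непрош,мн,изъяв,1-л}
здесь{здесь=ADVPRO=}
три{три=NUM=им|три=NUM=вин,неод}
недели{неделя=S,жен,неод=вин,мн|неделя=S,жен,неод=род,ед|неделя=S,жен,неод=им,мн}
"""

# Report only the tokens that still carry more than one reading.
for line in SAMPLE.splitlines():
    form, readings = parse_line(line)
    if len(readings) > 1:
        print(form, len(readings))
# prints:
# три 2
# недели 3
```

Any output format with one analysis per token would have to pick one of these readings or store the rest somewhere non-standard, which is where the loss comes from.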
This may not be possible in every case, but where possible, add other common output formats:
- CoNLL (X/U)
- mystem
- Multext-East (Sharoff et al.)