gchrupala / morfette

Supervised learning of morphology
BSD 2-Clause "Simplified" License
28 stars, 5 forks

Output for Cyrillic is broken #28

Open bgospodinov opened 6 years ago

bgospodinov commented 6 years ago

Even though I am passing UTF-8 encoded input without a BOM, this is the output I get:

╨й╨╛╨╝ ╤й╨╛╨╝ Cs ╤Б╨╡ ╤б╨╡ Ppxta ╨╜╨░╤П╨╝ ╨╜╨░╤п╨╝ Vpiif-r1s , , punct ╤Б╤В╨░╨▓╨░╨╝ ╤б╤в╨░╨▓╨░╨╝ Vpitf-r1s . . punct
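
One way to check what kind of corruption this is: the box-drawing characters look like UTF-8 bytes displayed through a legacy DOS code page. The sketch below is only a diagnostic, not a fix for morfette. It assumes code page 866 (a guess based on the characters shown, not something confirmed in this report), and the helper names are made up for illustration.

```python
# Assumption: the garbled text is UTF-8 bytes that were rendered through
# code page 866. This only tests that hypothesis on a sample token.

def utf8_seen_as_cp866(text: str) -> str:
    """Show how correctly encoded UTF-8 text looks when read as cp866."""
    return text.encode("utf-8").decode("cp866")


def undo_cp866_mojibake(garbled: str) -> str:
    """Reverse the mix-up: recover the raw bytes, then decode them as UTF-8."""
    return garbled.encode("cp866").decode("utf-8")


# First token of the output above, copied verbatim.
first_token = "╨й╨╛╨╝"
print(undo_cp866_mojibake(first_token))
```

If the first column comes back as readable Cyrillic, the input bytes were probably intact and the problem is likely in how the output is encoded or displayed rather than in the input file.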