nuwainfo / tibetaneditor

Working repository for a simple Tibetan editor with a segmenter, spellchecker, rule editor, concordancer and more.

Tibetan improperly processed #10

Closed (thedirk closed 6 years ago)

thedirk commented 6 years ago

Words that end in a 'shad' aren't processed. Words that end in a 'space' character aren't processed.

thedirk commented 6 years ago

[screenshot 45] Here, you can see "dang" is in my list as a "red" word, but without a 'tsheg' it doesn't get highlighted.

Words without punctuation (e.g., "dang" in the screenshot above) don't get processed. That may not be an issue for most writers, but if I'm used to inserting spaces as an author, I'd expect the software to process a text properly even if I've already word-spaced it.

thedirk commented 6 years ago

[screenshot 46] Another example: words without 'tsheg' endings don't get processed as words. Here, བཀྲ་ཤིས is only processed as a "noun" when it has a final 'tsheg'.

Expected behavior: whether it's due to word-spacing, an unconventional usage, or even author error, I'd expect such a word to be processed.
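To make the expectation concrete, here is a minimal sketch of one way a lookup could ignore whatever delimiter follows a word. It is not taken from the tibetaneditor codebase: the names `normalize`, `lookup`, and `LEXICON` are hypothetical, and the tags simply mirror the "red" and "noun" labels mentioned in this thread. The idea is to strip a trailing tsheg (U+0F0B), shad (U+0F0D), or space before consulting the word list, while leaving the tsheg between syllables intact.

```python
from typing import Optional

# Hypothetical sketch; none of these names come from the tibetaneditor source.
TSHEG = "\u0f0b"  # TIBETAN MARK INTERSYLLABIC TSHEG
SHAD = "\u0f0d"   # TIBETAN MARK SHAD
TRAILING_DELIMITERS = TSHEG + SHAD + " "


def normalize(word: str) -> str:
    """Drop any run of trailing tsheg, shad, or space characters."""
    return word.rstrip(TRAILING_DELIMITERS)


# Keys are stored in normalized form, so an entry matches however the
# word is terminated in the text. The tags are illustrative assumptions.
LEXICON = {
    normalize("བཀྲ་ཤིས་"): "noun",
    normalize("དང་"): "red",
}


def lookup(word: str) -> Optional[str]:
    """Return the tag for a word, ignoring its trailing delimiter."""
    return LEXICON.get(normalize(word))


if __name__ == "__main__":
    for ending in ("", TSHEG, SHAD, " "):
        print(repr(ending), lookup("བཀྲ་ཤིས" + ending))
```

Under this approach the same entry is found whether the word ends in a tsheg, a shad, a space, or nothing at all. Whether the editor's segmenter can be adjusted this way depends on how it tokenizes text, which this thread does not show.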

thedirk commented 6 years ago

[screenshot 47]

Another example: "yin" ending with a 'shad' doesn't get processed; with a 'tsheg', it does. Same with "bde legs".