tapeinosyne / hyphenation

Text hyphenation for Rust
Apache License 2.0

Missing opportunities / producing too few syllables #37

Closed by piegamesde 1 year ago

piegamesde commented 1 year ago

Some examples (I can provide many more if you'd like):

problematic -> prob-lem-atic
combustibleness -> com-bustible-ness

I don't need the algorithm to be perfect by any means, but "atic" and "bustible" are pretty weird syllables. I'd have to measure properly, but I'd estimate the error rate at about 5% on average, and up to 15% for longer and less common words.

piegamesde commented 1 year ago

TIL that hyphenation and syllabification are slightly distinct problems, and that the former is conservative by design to have a low false-positive rate.
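
For anyone who wants to put numbers on this, here is a minimal sketch of how the two kinds of error could be counted separately: breaks the hyphenator leaves out (missed opportunities) versus breaks it places where the reference has none (false positives). It uses only the standard library and does not call the hyphenation crate; the function names `break_offsets` and `compare_breaks` are made up for illustration, and the reference syllabifications are assumed to come from a dictionary of your choice.

```rust
use std::collections::BTreeSet;

/// Collects the break positions of a word written with '-' markers,
/// expressed as byte offsets into the word with the markers removed.
/// (Hypothetical helper; assumes '-' is only used as a break marker.)
fn break_offsets(marked: &str) -> BTreeSet<usize> {
    let mut offsets = BTreeSet::new();
    let mut pos = 0;
    for ch in marked.chars() {
        if ch == '-' {
            offsets.insert(pos);
        } else {
            pos += ch.len_utf8();
        }
    }
    offsets
}

/// Compares the hyphenator's output against a reference syllabification.
/// Returns `(missed, spurious)`:
/// - `missed`: breaks present in the reference but not produced,
/// - `spurious`: breaks produced but absent from the reference.
fn compare_breaks(produced: &str, reference: &str) -> (usize, usize) {
    let got = break_offsets(produced);
    let want = break_offsets(reference);
    let missed = want.difference(&got).count();
    let spurious = got.difference(&want).count();
    (missed, spurious)
}
```

Assuming the references are "prob-lem-at-ic" and "com-bus-ti-ble-ness", `compare_breaks("prob-lem-atic", "prob-lem-at-ic")` gives `(1, 0)` and `compare_breaks("com-bustible-ness", "com-bus-ti-ble-ness")` gives `(2, 0)`: breaks are missed, but none are spurious, which matches the conservative behavior described above.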