typst / hypher

Separates words into syllables.
Apache License 2.0

Variable penalty? #16

Open TheChilliPL opened 6 months ago

TheChilliPL commented 6 months ago

It would be amazing if this could support dictionaries with variable penalties for hyphenation, like in Hunspell-like dictionaries.

For example, hyph_pl_PL.dic might include entries like:

.nie8ch9że.

where `.` I believe means word boundaries (as there might be more generic rules) and digits mean the penalties, thus `niechże` with the minimum value of 8 might be hyphenated as `nie-ch-że` and with 9 as `niech-że` only.
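To make the reading above concrete, here is a minimal Python sketch of it. It assumes exactly the interpretation described in this comment (a digit is a rating, and a break is kept when its digit is at least the chosen minimum), which is not necessarily how Hunspell-like dictionaries actually define the digits. The function names `parse_entry` and `hyphenate_with_minimum` are made up for illustration and are not part of hypher.

```python
def parse_entry(entry):
    """Split an entry such as '.nie8ch9że.' into the bare word and a map
    from character offset to the digit written at that offset."""
    letters = []
    values = {}
    for ch in entry.strip("."):  # drop the word-boundary dots
        if ch in "0123456789":
            values[len(letters)] = int(ch)
        else:
            letters.append(ch)
    return "".join(letters), values


def hyphenate_with_minimum(entry, minimum):
    """Keep only the breaks whose digit is at least `minimum`
    (the interpretation assumed in this comment)."""
    word, values = parse_entry(entry)
    breaks = sorted(
        pos for pos, value in values.items()
        if value >= minimum and 0 < pos < len(word)
    )
    parts = []
    start = 0
    for pos in breaks:
        parts.append(word[start:pos])
        start = pos
    parts.append(word[start:])
    return "-".join(parts)


print(hyphenate_with_minimum(".nie8ch9że.", 8))
print(hyphenate_with_minimum(".nie8ch9że.", 9))
```

With a minimum of 8 both digits qualify, and with a minimum of 9 only the second one does, which matches the two hyphenations given above.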

One could also implement a weighted approach, with the library trying to find the lowest penalty hyphenation along with the best alignment of the text.
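A rough sketch of what such a weighted choice could look like, in Python. Everything here is an assumption made for illustration: each break opportunity carries a penalty where lower is better, widths are counted in characters, and `slack_weight` is a hypothetical knob that trades the penalty against the unused space left on the line. It is not something hypher or Typst provides.

```python
def best_break(penalties, room, slack_weight=1.0):
    """Pick the break offset with the lowest combined cost.

    penalties: map from break offset (characters before the break) to a
               penalty, lower meaning a better place to break (assumed).
    room:      characters still free on the current line.
    Returns the chosen offset, or None if no break fits.
    """
    best_offset = None
    best_cost = None
    for offset, penalty in sorted(penalties.items()):
        width = offset + 1  # the prefix plus the hyphen character
        if width > room:
            continue  # this prefix would overflow the line
        cost = penalty + slack_weight * (room - width)
        if best_cost is None or cost < best_cost:
            best_offset, best_cost = offset, cost
    return best_offset


word = "niechże"
offset = best_break({3: 2, 5: 1}, room=6)
if offset is not None:
    print(word[:offset] + "-", word[offset:])
```

A real line breaker would fold this cost into its paragraph-level optimization rather than deciding word by word; the sketch only shows how a per-break penalty and the alignment could be combined into one number.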

laurmaedje commented 6 months ago

Typically, in TeX hyphenation patterns the digits handle the priority among the patterns (see the blog post on hypher). I don't think that this automatically implies a rating of the opportunities, so I'm not sure whether this would really make sense.
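For reference, the priority mechanism can be sketched in a few lines of Python. In TeX-style (Liang) patterns, every pattern that matches a word contributes its digits, the highest digit at each inter-letter position wins, and an odd result permits a break while an even one forbids it. The three toy patterns below are invented for illustration and are not real language data; the function names are hypothetical, and real implementations such as hypher additionally enforce minimum prefix and suffix lengths and use a far more compact pattern lookup.

```python
def parse_pattern(pattern):
    """Split 'a2bc' into ('abc', [0, 2, 0, 0]); levels[i] is the digit
    written before letters[i], and the last entry follows the final letter."""
    letters = []
    levels = [0]
    for ch in pattern:
        if ch in "0123456789":
            levels[-1] = int(ch)
        else:
            letters.append(ch)
            levels.append(0)
    return "".join(letters), levels


def hyphenate_with_patterns(word, patterns):
    padded = "." + word.lower() + "."  # dots mark the word boundaries
    levels = [0] * (len(padded) + 1)
    for pattern in patterns:
        letters, pattern_levels = parse_pattern(pattern)
        start = padded.find(letters)
        while start != -1:
            # The highest level seen at each position wins.
            for i, level in enumerate(pattern_levels):
                levels[start + i] = max(levels[start + i], level)
            start = padded.find(letters, start + 1)
    parts = []
    begin = 0
    for j in range(1, len(word)):
        # levels[j + 1] is the level just before word[j]; odd allows a break.
        if levels[j + 1] % 2 == 1:
            parts.append(word[begin:j])
            begin = j
    parts.append(word[begin:])
    return "-".join(parts)


toy_patterns = ["a1b", "a2bc", "ab3cd"]
print(hyphenate_with_patterns("abba", toy_patterns))
print(hyphenate_with_patterns("abc", toy_patterns))
print(hyphenate_with_patterns("abcd", toy_patterns))
```

In `abc` the 2 from `a2bc` overrides the 1 from `a1b`, so the break is suppressed. The digits decide which pattern wins at a position, and the final result is only "break allowed" or "break forbidden", not a graded score. Read this way, the entry `.nie8ch9że.` from the first comment would forbid a break after `nie` (8 is even) and allow one after `niech` (9 is odd).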

TheChilliPL commented 6 months ago

Oh, I might have actually misunderstood how the patterns work; the blog post cleared things up. However, in Affinity Publisher there is a minimum hyphenation value setting which makes words more or less likely to be hyphenated, and it uses Hunspell-like dictionaries as well 🤔