heiglandreas / Org_Heigl_Hyphenator

Provide TeX-Hyphenation to PHP
http://orgheiglhyphenator.readthedocs.org
MIT License

English hyphenation results #51

Open joseflorido opened 4 years ago

joseflorido commented 4 years ago

Hi!

I tried this code:

    use \Org\Heigl\Hyphenator as h;

    $hyphenator = h\Hyphenator::factory();
    echo $hyphenator->hyphenate('hyphenation'); // hy-phe-na-ti-on
    echo $hyphenator->hyphenate('chocolate'); // choco-late

Expected results are:

    hy-phen-a-tion
    choc-o-late

My config is:

    noHyphenateString = null
    hyphen = "-"
    leftMin = 1
    rightMin = 1
    wordMin = 3
    quality = 9
    customHyphen = "=="
    defaultLocale = "en_US"
    tokenizers = "Whitespace,Punctuation"
    filters = "Simple,CustomMarkup"

Any idea why I am seeing different results?

Thanks! Jose

heiglandreas commented 3 years ago

Hey @joseflorido, sorry for the late response. It looks like the hyphenation patterns this library is based on (the American English hyphenation patterns for OpenOffice.org) do not contain patterns that allow the hyphenation you expect.

Since other hyphenation algorithms exist (some of them rather expensive), other websites may well propose different hyphenations.

I'm currently checking, though, whether a newer dictionary file is available that might match your expectations as well.
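To illustrate why the pattern file decides the result, here is a minimal sketch of TeX-style pattern hyphenation. It is not this library's implementation: the functions `parsePattern` and `hyphenateWord` are made up for this example, and the patterns `1la` and `c1o` are hypothetical ones chosen to reproduce the two outputs discussed above, not entries taken from the OpenOffice.org dictionary.

```php
<?php
// Toy sketch of pattern-based hyphenation. All names and patterns here are
// assumptions for illustration; they are not part of Org_Heigl_Hyphenator.

// Split a pattern such as "c1o" into its letters ("co") and the weights
// for each gap around those letters ([0, 1, 0]).
function parsePattern(string $pattern): array
{
    $letters = '';
    $weights = [0];
    foreach (str_split($pattern) as $ch) {
        if (ctype_digit($ch)) {
            $weights[count($weights) - 1] = (int) $ch;
        } else {
            $letters .= $ch;
            $weights[] = 0;
        }
    }
    return [$letters, $weights];
}

// Apply every pattern to the word, keep the highest weight per gap,
// and insert a hyphen wherever the final weight is odd.
function hyphenateWord(string $word, array $patterns, int $leftMin = 2, int $rightMin = 2): string
{
    $padded = '.' . strtolower($word) . '.'; // dots mark the word boundaries
    $scores = array_fill(0, strlen($padded) + 1, 0);

    foreach ($patterns as $pattern) {
        [$letters, $weights] = parsePattern($pattern);
        if ($letters === '') {
            continue; // a pattern without letters cannot match anything
        }
        $offset = 0;
        while (($pos = strpos($padded, $letters, $offset)) !== false) {
            foreach ($weights as $i => $weight) {
                $scores[$pos + $i] = max($scores[$pos + $i], $weight);
            }
            $offset = $pos + 1;
        }
    }

    $out = '';
    $length = strlen($word);
    for ($j = 0; $j < $length; $j++) {
        // $scores[$j + 1] is the gap directly before $word[$j]
        $allowed = $j >= $leftMin && $length - $j >= $rightMin;
        if ($allowed && $scores[$j + 1] % 2 === 1) {
            $out .= '-';
        }
        $out .= $word[$j];
    }
    return $out;
}

echo hyphenateWord('chocolate', ['1la']), "\n";        // choco-late
echo hyphenateWord('chocolate', ['1la', 'c1o']), "\n"; // choc-o-late
```

The algorithm is identical in both calls; only the pattern set differs. A break that no pattern in the dictionary permits can never appear in the output, whatever the configuration.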

heiglandreas commented 3 years ago

Until then, you can add your own hyphenation patterns as described in https://github.com/heiglandreas/Org_Heigl_Hyphenator/issues/49#issuecomment-650567589