Kozea / Pyphen

Hy-phen-ation made easy
https://courtbouillon.org/pyphen

User-specified hyphenation dictionaries #55

Open Gayo opened 1 year ago

Gayo commented 1 year ago

It looks as though Pyphen fully supports arbitrary user-specified hyphenation dictionaries via its filename parameter, but WeasyPrint doesn't currently make use of that capability. It would be nice to have a command-line switch to specify this option, as with stylesheets, in case people want to do something a bit unusual. At present, the only way to tweak hyphenation policy at this level is to manually edit the contents of the Pyphen package, which is a bit dicey.
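For illustration, here is a minimal sketch of loading a dictionary file directly through Pyphen's filename parameter, independent of WeasyPrint. The helper name load_hyphenator and the example path are hypothetical. The Pyphen import sits inside the function only so that the path check can run first.

```python
from pathlib import Path


def load_hyphenator(dic_path):
    """Return a pyphen.Pyphen instance backed by a user-supplied .dic file.

    `load_hyphenator` is a hypothetical helper, not part of Pyphen.
    """
    dic_path = Path(dic_path)
    # Fail early with a clear message if the dictionary file is missing.
    if not dic_path.is_file():
        raise FileNotFoundError(
            f'hyphenation dictionary not found: {dic_path}')

    import pyphen  # third-party dependency, imported lazily

    # `filename` bypasses the bundled dictionaries and uses the given file.
    return pyphen.Pyphen(filename=dic_path)


# Example use (placeholder path):
# hyphenator = load_hyphenator('/path/to/hyph_xx.dic')
# print(hyphenator.inserted('hyphenation'))
```

This only covers direct use of Pyphen. It does not by itself change which dictionary WeasyPrint picks for a given lang attribute.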

liZe commented 1 year ago

Hi!

Thanks for this issue! We should actually add a public API to register a new Pyphen dictionary. It’s already possible but not documented yet:

from pathlib import Path

import pyphen
from weasyprint import HTML

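# Undocumented: register the dictionary file under the language code 'xx'.
pyphen.LANGUAGES['xx'] = Path('/xxx/hyph_xx.dic')
# Map the lowercase form of the code to its key in LANGUAGES.
pyphen.LANGUAGES_LOWERCASE['xx'] = 'xx'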

html = '''
<html lang="xx">
...
</html>
'''

HTML(string=html).write_pdf('/tmp/output.pdf')

I’d like to add something like pyphen.register('xx', Path('/xxx/hyph_xx.dic')). Would that be OK for you?

Gayo commented 1 year ago

That'd be perfect as long as there's a way to specify the preference through the command-line interface, whether directly or via a config file. Currently I'm using WeasyPrint as the second step of a pandoc conversion, so I favour command-line-based solutions, but if that seems too fussy, exposing it only through the public Python API would still be very useful.

liZe commented 1 year ago

I think that it’s too specific to be available as a command-line option. But having the feature in Pyphen’s API is a good idea, so let’s move this issue over there!
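For a pandoc-style pipeline, one possible stopgap in the meantime is a small wrapper script that takes dictionaries on the command line and applies the undocumented registration shown earlier in this thread. The sketch below is only that. The option name --hyphenation-dict, the LANG=PATH syntax, and the function names are all hypothetical, and the two pyphen mappings are internals that may change between releases.

```python
import argparse
from pathlib import Path


def parse_dict_spec(spec):
    """Split a 'LANG=PATH' argument into (language code, dictionary path)."""
    lang, sep, path = spec.partition('=')
    if not sep or not lang or not path:
        raise argparse.ArgumentTypeError(
            f'expected LANG=PATH, got {spec!r}')
    return lang, Path(path)


def build_parser():
    parser = argparse.ArgumentParser(
        description='Render HTML to PDF with extra hyphenation dictionaries.')
    parser.add_argument('input', help='HTML file to render')
    parser.add_argument('output', help='PDF file to write')
    # Hypothetical option; repeat it to register several dictionaries.
    parser.add_argument(
        '--hyphenation-dict', metavar='LANG=PATH', type=parse_dict_spec,
        action='append', default=[],
        help='register PATH as the hyphenation dictionary for LANG')
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)

    # Third-party imports live here so that argument parsing can be
    # exercised without WeasyPrint or Pyphen installed.
    import pyphen
    from weasyprint import HTML

    for lang, path in args.hyphenation_dict:
        # Same undocumented assignments as in the snippet from this thread.
        # Assumption: the lowercase mapping is keyed by the lowercased code.
        pyphen.LANGUAGES[lang] = path
        pyphen.LANGUAGES_LOWERCASE[lang.lower()] = lang

    HTML(filename=args.input).write_pdf(args.output)
```

Calling main(['in.html', 'out.pdf', '--hyphenation-dict', 'xx=/xxx/hyph_xx.dic']) would then render in.html with the extra dictionary registered, provided the HTML declares lang="xx" as in the example above.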