paddymcall / SARIT-pdf-conversions

XML to PDF for SARIT texts
https://github.com/sarit/SARIT-corpus

Fix hyphenation (and overfull hboxes) #14

Closed ppasedach closed 8 years ago

ppasedach commented 8 years ago

I got the impression that the discretionaries (\-) do more harm than good: once a word contains one, LaTeX no longer hyphenates it automatically at any point other than the ones specified. I ran an automatic search and replace to remove them (see attachments), and the result looks better, in my opinion. So if we do use discretionaries in a word, we should put them at all reasonable break points. Looking at the XML file and the stylesheet, I was surprised to see practically no hyphenation in the former, and in the latter a function that introduces (among other things) hyphenation after every long vowel, with a "should be fixed" in its description. I would suggest generally leaving hyphenation to polyglossia, which already does quite a good job with the rules defined in hyph-sa.tex. One could perhaps raise the minimum number of characters left before a break to 2,

\PolyglossiaSetup{sanskrit}{
  hyphenmins={2,3},% default is {1,3}
}

but that's a matter of taste. At least one problem remains, though: strings of more than 63 characters, beyond which TeX's hyphenation ceases to function. It is an odd but interesting problem, and one that can occur in Sanskrit. My experiments with manual discretionaries did not give satisfactory results here, I suppose because such lines have no stretchable inter-word space. Possibly microtype could help, but that would mean switching to lualatex or pdflatex. Otherwise we just live with the imperfection and, where absolutely required, manually adjust the letter spacing of the string concerned in the tex file.
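
To illustrate the point about discretionaries, here is a minimal sketch. It is not taken from the SARIT stylesheet: the English word "hyphenation" and LaTeX's default patterns stand in for Sanskrit so that the file compiles with any engine and no extra fonts. The zero-width boxes are only a test device that forces a break at every point TeX considers legal.

```latex
% Minimal sketch: how a single \- changes the break points of a word.
% Assumption: default (English) hyphenation patterns stand in for hyph-sa.tex.
\documentclass{article}
\begin{document}

% A zero-width \parbox forces TeX to break wherever it is allowed to.
% \hspace{0pt} puts glue before the word so that it is a hyphenation candidate.

Automatic hyphenation only:\par
\parbox[t]{0pt}{\hspace{0pt}hyphenation}

\bigskip
One explicit discretionary (breaks only at that point):\par
\parbox[t]{0pt}{\hspace{0pt}hyphen\-ation}

\bigskip
Discretionaries at every reasonable point:\par
\parbox[t]{0pt}{\hspace{0pt}hy\-phen\-ation}

\end{document}
```

In the second box the word should break only at the explicit \-, which is the behaviour described above. The third box shows what "use them at all reasonable places" would mean in practice.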

ratnakirti-nibandhavali_discretionaries_removed.pdf ratnakirti-nibandhavali.pdf