Closed by jwhitham 11 months ago
Thanks for the detailed bug reports, @jwhitham.
Unfortunately, nowadays I'm unlikely to spend much free time on rinohtype. Since you are using rinohtype in a commercial setting, and assuming it is providing value to your company, there are some options for getting these issues fixed in a timely manner:
Both options would help to keep the project sustainable.
Thanks. I'm grateful for your support. I have created a simple pull request for issue 415.
For this issue, I did write a possible fix, but I'm not happy with the code quality. For now I'd rather deal with the problem using the workaround of inserting zero-width spaces, which seems to work fairly well for the documents I have produced so far.
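The zero-width-space workaround can be applied as a preprocessing step on the document source. The snippet below is only a sketch of that idea: `add_break_hints` and `min_length` are hypothetical names, not part of rinohtype, and it assumes that long path-like words should get a break opportunity after every slash except a leading one.

```python
ZERO_WIDTH_SPACE = "\u200b"  # U+200B, an invisible break opportunity


def add_break_hints(text, min_length=20):
    """Insert a zero-width space after slashes inside long words.

    Hypothetical helper: words shorter than `min_length` are left alone,
    and a leading slash gets no hint so it is never left alone on a line.
    """
    hinted = []
    for word in text.split(" "):
        if len(word) >= min_length:
            # Keep the first character as-is, hint every later slash.
            word = word[:1] + word[1:].replace("/", "/" + ZERO_WIDTH_SPACE)
        hinted.append(word)
    return " ".join(hinted)
```

Stripping the zero-width spaces from the result gives back the original text, so the change is invisible apart from the extra break points.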
I will ask about the possibility of sponsoring your project within the company.
The current master branch will now automatically split "words" at slashes and it also fixes hyphenation of the first word on a line. See for example hyphenation.pdf.
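To illustrate what splitting a "word" at slashes means, here is a minimal sketch. It is not the code on master; `split_at_slashes` is a hypothetical name, and the sketch assumes a break is allowed immediately after each slash.

```python
import re


def split_at_slashes(word):
    """Split a word into pieces that each end right after a slash.

    Illustrative only: the final piece has no trailing slash unless the
    word itself ends with one.
    """
    # Either "any run of non-slashes followed by a slash" or a final
    # run of non-slashes at the end of the word.
    return re.findall(r"[^/]*/|[^/]+", word)


pieces = split_at_slashes("/Example/Demo/Example")
# pieces == ["/", "Example/", "Demo/", "Example"]
```

Joining the pieces reproduces the original word, so a line breaker could treat each boundary between pieces as a place where the line may end.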
There would be benefit in handling splitting separately for paths, URLs and regular text, but that requires semantic information. The :file: role could supply it, however.
Is there an existing issue for this?
PDF produced by rinohtype
target.pdf
On page 5 there are two long lines. The first line consists of a single long word ("/Example/Demo/Example...") which overflows the right-hand side of the page.
The second line consists of a short word and the same long word (i.e. "word" then "/Example/Demo/Example..."). In this case, the second word is split onto two lines with a hyphen.
Expected behavior: both of these long words ought to be split onto multiple lines with hyphens.
Actual behavior: a long word is not hyphenated if it is the first word on a line.
I think that this is the same issue reported in https://github.com/brechtm/rinohtype/issues/188 . Adding zero-width spaces to the text will avoid the problem (though it introduces another problem, see https://github.com/brechtm/rinohtype/issues/415 ). However, as the long word can be hyphenated, it would be better if it could just be hyphenated - regardless of whether it is the first word, second word, or any other word.
The problem also occurs if all or part of a long word becomes the first word in a line as a result of earlier overflows. The last line on page 5 has this problem: notice that the word spills into the right margin. It ought to be hyphenated again, splitting over three lines, but it is not.
The problem can happen to short words too. When a very long word is in the second column of a table, the first column may be "squeezed", becoming so narrow that even a relatively short word needs to be hyphenated. However, if that word is the first word on the line, it can't be hyphenated, so it overflows into the second column.
I think https://github.com/brechtm/rinohtype/blob/b7be22f68c78fe3e21de39949d51c7b474d1ac1a/src/rinoh/paragraph.py#L1104 may be the place that introduces different behavior for the first word in a line. Is there any way that such a word could be hyphenated?
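The behavior described above can be reproduced with a toy model. The sketch below is not rinohtype's code: `break_lines` and `split_first_word` are hypothetical names, text is treated as monospaced with `width` characters per line, and a word may be split at any character, whereas real hyphenation only breaks at permitted points.

```python
def break_lines(words, width, split_first_word=False):
    """Greedy line breaking with an optional first-word special case.

    A word that does not fit is split with a trailing hyphen, unless it
    is the first word on its line and `split_first_word` is False; in
    that case it is placed whole and overflows the line.
    """
    if width < 2:
        raise ValueError("width must leave room for one character and a hyphen")
    lines, current = [], ""
    queue = list(words)
    while queue:
        word = queue.pop(0)
        candidate = word if not current else current + " " + word
        if len(candidate) <= width:
            current = candidate
            continue
        if not current and not split_first_word:
            lines.append(word)  # first word on the line: overflows
            continue
        # Space left on this line, reserving one character for the hyphen.
        room = width - len(current) - (1 if current else 0) - 1
        if room >= 1:
            prefix = current + " " if current else ""
            lines.append(prefix + word[:room] + "-")
            queue.insert(0, word[room:])  # remainder starts the next line
        else:
            lines.append(current)  # no room to split here, retry on a new line
            queue.insert(0, word)
        current = ""
    if current:
        lines.append(current)
    return lines


words = ["word", "ABCDEFGHIJKLMNOPQRSTUVWXYZ"]
print(break_lines(words, 10))
print(break_lines(words, 10, split_first_word=True))
```

With the flag off, the long word is split once after "word", but its remainder becomes the first word on the next line and overflows, matching the last line on page 5. With the flag on, every line stays within the width.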
Source files
no-hyphenation-for-first-word.zip
The bug can be reproduced by running "demo.bat".
Versions