electricbookworks / electric-book

A Jekyll template for creating books in multiple formats
https://electricbookworks.github.io/electric-book
GNU General Public License v3.0

Using word-break: break-all to wrap URLs in PDFs? #630

Closed jaycolmvar closed 2 years ago

jaycolmvar commented 2 years ago

The screen-pdf output for my book is looking pretty good (thanks, EBW!), but there's been one thing bothering me:

I have source citations in endnotes and the bibliography that include long URLs without conveniently placed '/' or '-' characters. When formatted, those notes and bibliography entries end up with just a few words to a line and lots of space between the words, because the URL cannot break at a '/' or '-'. Something like the following (if I can get the Markdown to work):

a          few          words         and         then         a          long              URL:          https://example.com/
areallyreallylongurlthathasnoslashcharactersorhyphenswherealinewrapcanbemade.

After doing some googling on this, I decided to modify _sass/template/partials/_print-base-typography.scss and change the CSS properties on the anchor element ('a') to use "word-break: break-all" instead of the previous "word-break: break-word".

This appears to do exactly what I want. I get output like

a few words and then a long URL: https://example.com/areallyreallylongurlthathasnoslashcharactersor
hyphenswherealinewrapcanbemade.

(which I think looks much better) and the links still work when you click on them.

Are you all aware of any downsides to making this change?

jaycolmvar commented 2 years ago

Following up my comment above: Well, I found one downside to using "word-break: break-all". I have at least one case where the URL line-breaks right after the initial 'h' in 'https', which looks bad, and another case where the last character of the URL is on a line by itself, which looks just as bad. In the first case I forced a line break right before the 'h'. In the second case I rewrote the preceding text to make the URL line-break better. I don't know of any better way to handle these cases.

jaycolmvar commented 2 years ago

OK, I've done some more investigation, and I think that the existing code specifying "word-break: break-word" is best left as is. If you need to force a line break in a long URL, the better approach is to insert a zero-width space (U+200B, or &#x200B; in HTML). With "word-break: break-word", Prince XML will break at a slash ('/') and at a hyphen ('-'). The Chicago Manual of Style also recommends breaking before periods, e.g., in long domain names. Prince XML does not do that, but you can achieve the same effect by converting a URL source citation from

<https://really.really.long.domain.name.example.com>.

to

[https://really&#x200B;.really&#x200B;.long&#x200B;.domain&#x200B;.name&#x200B;.example&#x200B;.com](https://really.really.long.domain.name.example.com)

This lets Prince XML insert a line break in the middle of the domain name if needed, but still preserves the actual URL being linked to, so you can click on the link.
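
If you have many such citations, the conversion could in principle be scripted rather than done by hand. The following is only a minimal sketch in Python, not part of the Electric Book template: the function name breakable_link is hypothetical, and it assumes you only want break opportunities before periods in the host name, leaving the path and the link target untouched.

```python
from urllib.parse import urlsplit

# HTML entity for U+200B ZERO WIDTH SPACE, as used in the example above
ZWSP = "&#x200B;"


def breakable_link(url: str) -> str:
    """Return a Markdown link whose visible text can break before each
    period in the host name. The link target is left unchanged."""
    host = urlsplit(url).netloc
    # Put a zero-width space entity before every period in the host only
    soft_host = host.replace(".", ZWSP + ".")
    # Replace just the first occurrence so the path and query stay as they are
    text = url.replace(host, soft_host, 1)
    return f"[{text}]({url})"


print(breakable_link("https://really.really.long.domain.name.example.com"))
```

For the example URL this produces the same Markdown link as the hand-edited version above. It does not escape characters that are special in Markdown, so URLs containing brackets or parentheses would still need checking by hand.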

Note that the Prince support forum has a post where someone (a Prince employee) recommends using span.url to do this automatically: https://www.princexml.com/forum/topic/1121/support-for-css3-word-wrap

I tried this but couldn't get it to work, though I'm no expert in the way the EB templates work.

jaycolmvar commented 2 years ago

I'm closing this issue because I don't think a change is warranted at this time. The workaround of using a zero-width space (&#x200b;) works fine, and allows both compliance with CMOS style and more fine-grained control if needed.

arthurattwell commented 2 years ago

Thanks, @jaycolmvar, this is a valuable thread for anyone solving similar issues in future.