electricbookworks / electric-book

A Jekyll template for creating books in multiple formats
https://electricbookworks.github.io/electric-book
GNU General Public License v3.0

Automate solving of widows and orphans #162

Open arthurattwell opened 7 years ago

arthurattwell commented 7 years ago

When making books, we spend most of our time on print output fixing widows and orphans. I suspect we could automate most, if not all, of this work.

I'm guessing that this would mean using JavaScript with Prince, and/or using vivliostyle.js for formatting instead of Prince. (Vivliostyle doesn't yet support as much CSS Paged Media, or some of the other cool stuff that Prince does, but it's getting there and is open.)

'Widows' and 'orphans' refer here to three problems in typography (the two terms are often used the other way round, as in CSS's widows and orphans properties):

  1. A single line at the bottom of a page (widow)
  2. A single line at the top of a page (orphan)
  3. A very short line (e.g. 5 or fewer characters) at the end of a paragraph (usually just called a short-line).

Good designers adjust letter-spacing (called tracking in software like InDesign) in various paragraphs to prevent these, usually tightening or loosening spacing by up to 10/1000 em. Where it's impossible to avoid all three problems, the designer finds the least-bad layout. From best to worst, the possible outcomes rank in this order:

  1. No widows, orphans or short lines.
  2. One or more short lines.
  3. A widow on a left-hand page.
  4. A widow on a right-hand page.
  5. An orphan, wider than half the text area, on a right-hand page.
  6. An orphan, wider than half the text area, on a left-hand page.
  7. An orphan, less than half the text area, on a right-hand page.
  8. An orphan, less than half the text area, on a left-hand page.

And then, given a choice, one might choose two widows on right-hand pages (not great) over one short-line orphan on a left-hand page (much worse). If we assigned a score to each possible problem, we could make this choice mathematically, possibly weighting the worst problems more heavily.
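As a rough sketch of that scoring idea: the penalty values below are invented purely for illustration (only their order follows the ranking above), and the problem names are hypothetical labels, not anything the theme defines.

```javascript
// Hypothetical penalty weights. Only the ordering comes from the ranking
// above; the actual numbers are assumptions and would need tuning.
const PENALTIES = {
  shortLine: 1,
  widowLeft: 2,
  widowRight: 3,
  orphanWideRight: 5,
  orphanWideLeft: 6,
  orphanNarrowRight: 9,
  orphanNarrowLeft: 10,
};

// Sum the penalties for a list of detected problems. Lower is better;
// 0 means no widows, orphans or short lines at all.
function scoreLayout(problems) {
  return problems.reduce((total, kind) => {
    if (!Object.prototype.hasOwnProperty.call(PENALTIES, kind)) {
      throw new Error(`Unknown problem type: ${kind}`);
    }
    return total + PENALTIES[kind];
  }, 0);
}

// Two widows on right-hand pages versus one short orphan on a left-hand page.
const twoWidows = scoreLayout(["widowRight", "widowRight"]);
const oneOrphan = scoreLayout(["orphanNarrowLeft"]);
console.log(twoWidows < oneOrphan); // true with these weights (6 < 10)
```

With weights like these, the two-widows layout wins, which matches the designer's instinct described above. Different weights would shift where that trade-off falls.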

To resolve any given problem, a designer has to consider the impact of their tightening or loosening over a number of pages, sometimes moving forward and back over an entire document.

This design process is so universal and formulaic that I believe it must be possible to automate it: a script would need to detect each problem, then iteratively apply tightening or loosening classes to paragraphs to find the best possible outcome.
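One way that iteration might look is a simple greedy search. This is only a sketch: `evaluate` is a hypothetical callback that would have to re-run layout with the given adjustments and return a penalty score (lower is better), and a greedy search like this can get stuck in a local optimum rather than finding the true best layout.

```javascript
// Greedy search over per-paragraph spacing adjustments.
// paragraphIds: array of paragraph identifiers (hypothetical).
// candidateSteps: adjustments to try, in 1/1000 em (negative = tighten).
// evaluate: hypothetical function mapping { id: step } to a penalty score.
function findBestAdjustments(paragraphIds, candidateSteps, evaluate) {
  const adjustments = Object.fromEntries(paragraphIds.map((id) => [id, 0]));
  let bestScore = evaluate(adjustments);
  let improved = true;

  // Keep sweeping over the paragraphs until a full pass changes nothing.
  while (improved && bestScore > 0) {
    improved = false;
    for (const id of paragraphIds) {
      for (const step of candidateSteps) {
        if (step === adjustments[id]) continue;
        const trial = { ...adjustments, [id]: step };
        const score = evaluate(trial);
        if (score < bestScore) {
          // Accept any change that strictly lowers the total penalty.
          bestScore = score;
          adjustments[id] = step;
          improved = true;
        }
      }
    }
  }
  return { adjustments, score: bestScore };
}

// Toy stand-in for a real layout pass: pretend the ideal is p1 = -5, p2 = 10.
const toyEvaluate = (a) => Math.abs(a.p1 + 5) + Math.abs(a.p2 - 10);
const searchResult = findBestAdjustments(["p1", "p2"], [-10, -5, 0, 5, 10], toyEvaluate);
console.log(searchResult);
```

In practice each call to `evaluate` would mean a full re-render, so the number of trials matters far more than it does in this toy example.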

Incidentally, our classic theme currently uses tightening and loosening classes: e.g. p.tighten-5 tightens a paragraph by 5/1000 em. In the Sass, we set standard letter-spacing for the entire document, defined as $letter-spacing-text. We then generate classes for tightening and loosening:

$letter-spacing-text: 0.01em; // Default letter-spacing for p, ul, ol, dl. Set in ems, e.g. 0.01em for 10/1000s of an em.
$highlight-tightened: inherit; // set color for debugging
$highlight-loosened: inherit; // set color for debugging
$edition-suffix: ""; // for applying to different editions of the same content

// These classes control letter-spacing (tracking), usually to save widows and orphans.
@for $i from 1 through 100 {
  $add-space: $i * 0.001em;
  .tighten-#{$i}#{$edition-suffix} {
    letter-spacing: $letter-spacing-text - $add-space;
    font-style: inherit;
    background-color: $highlight-tightened;
  }
  .loosen-#{$i}#{$edition-suffix} {
    letter-spacing: $letter-spacing-text + $add-space;
    font-style: inherit;
    background-color: $highlight-loosened;
  }
}

So a script might insert these classes to use existing CSS.
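For instance, a script could translate a signed adjustment into one of the class names the Sass loop generates. The function name and its parameters here are made up for illustration; only the `tighten-N` / `loosen-N` naming and the 1 to 100 range come from the Sass above.

```javascript
// Map a signed adjustment in 1/1000 em to a class name from the Sass loop.
// Negative values tighten, positive values loosen, 0 means no class.
// editionSuffix mirrors $edition-suffix and defaults to an empty string.
function spacingClassFor(thousandths, editionSuffix = "") {
  const amount = Math.round(Math.abs(thousandths));
  if (amount === 0) return null;
  if (amount > 100) {
    throw new RangeError("The Sass loop only generates classes for 1 to 100");
  }
  const prefix = thousandths < 0 ? "tighten" : "loosen";
  return `${prefix}-${amount}${editionSuffix}`;
}

console.log(spacingClassFor(-5)); // "tighten-5"
console.log(spacingClassFor(3)); // "loosen-3"

// In a browser or Prince script, one might then do something like:
//   const cls = spacingClassFor(-5);
//   if (cls) paragraph.classList.add(cls);
```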

Or we might take a different approach entirely.

arthurattwell commented 7 years ago

Jeremy Keith uses JavaScript to pop a non-breaking space between the last two words in a paragraph to solve widowed words. Neat. It may cause problems with really long words, but it might be quicker to override those with some kind of tag than to solve each one manually.
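A minimal sketch of that non-breaking-space trick, working on plain text. This is not Jeremy Keith's actual code; the function name and the optional `maxLastWordLength` guard (a guess at one way to sidestep the long-word problem) are assumptions.

```javascript
// Replace the last ordinary space in a paragraph's text with a non-breaking
// space (U+00A0), so the final word cannot wrap onto a line by itself.
// maxLastWordLength is a hypothetical guard: skip the fix when the last
// word is so long that gluing it to its neighbour might hurt the layout.
function bindLastTwoWords(text, maxLastWordLength = Infinity) {
  const trimmed = text.trimEnd();
  const lastSpace = trimmed.lastIndexOf(" ");
  if (lastSpace === -1) return text; // zero or one word: nothing to bind

  const lastWord = trimmed.slice(lastSpace + 1);
  if (lastWord.length > maxLastWordLength) return text;

  const trailing = text.slice(trimmed.length); // preserve trailing whitespace
  return trimmed.slice(0, lastSpace) + "\u00A0" + lastWord + trailing;
}

console.log(bindLastTwoWords("The quick brown fox") === "The quick brown\u00A0fox"); // true
```

A real version would need to walk text nodes rather than whole strings, so that inline markup at the end of a paragraph is left intact.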

arthurattwell commented 6 years ago

@SteveBarnett wrote up his rough notes on poking at the Prince box-tracking API here and saved some bookmarks.

tgraham-antenna commented 3 years ago

FWIW, AH Formatter has an automated analysis feature that can report on problems like these. See https://www.antenna.co.jp/AHF/help/en/ahf-analyzer.html

Unfortunately, solving these things is much harder than finding them.