jgm / pandoc

Universal markup converter
https://pandoc.org

support for chapter labels in EPUB output #5308

Open brainchild0 opened 5 years ago

brainchild0 commented 5 years ago

Books vary in stylistic choice for chapter titles.

Many, especially in fiction, feature the text "Chapter <number>" preceding the title of each chapter. In some cases that text is set off in a separate block, which may be styled separately; in others a different word is used, following local language and convention or other particular considerations.

Numbering choices vary as well: Arabic numerals are common for main-matter chapters, but Roman numerals are sometimes preferred.

To support such variation, Pandoc could either insert the literal text into the output document, following configuration parameters, or offer style classes detailed enough to permit this effect through CSS selectors. One way or another, many authors will wish to control this styling.

Currently, the XHTML output included in the EPUB results permits some of these cases. A chapter title is represented as follows:

<h1><span class="header-section-number">1</span> The Beginning</h1>

Using the CSS content property, the stylesheet can cause the text "Chapter" (or anything else) to be prepended to the Arabic numeral. Further, setting display to block can cause the resulting "Chapter 1" to appear in a separate block, with the desired paragraph or character styling.
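As a minimal sketch of this technique, assuming the heading markup shown above and a custom stylesheet supplied to the EPUB (for example through pandoc's --css option), rules along these lines would produce the label. The font values are arbitrary placeholders chosen for illustration, not anything pandoc emits:

```css
/* Prepend a literal label to the numeral that pandoc places in the span. */
h1 .header-section-number::before {
  content: "Chapter ";
}

/* Put the resulting "Chapter 1" on its own line, styled apart from the title. */
h1 .header-section-number {
  display: block;
  font-size: 0.6em;        /* placeholder value */
  font-weight: normal;     /* placeholder value */
  text-transform: uppercase;
  margin-bottom: 0.5em;
}
```

Because the label is generated content rather than document text, readers may treat it differently from the numeral itself, for instance when selecting or copying.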

This approach appears to work, but has some idiosyncrasies. One is that if the user selects the block of text, only the numeral is highlighted, with the "Chapter" label apparently not considered part of the text.

Some choices that are currently impossible are:

  1. Using non-Arabic numerals.
  2. Changing the order of the number with respect to the title.

Request: Add appropriate application runtime options, or details to XHTML results, to support a greater variety of stylistic choices for chapter and other headings in EPUB results.

mb21 commented 5 years ago

I generally agree that there is potential in pandoc for adding more numbering schemes. This is however not limited to titles, and has been discussed somewhat in the context of figure and table numbers. As you see, there are lots of questions and choices to make.

Meanwhile:

> Using non-Arabic numerals

That should be possible by hiding the span, and using CSS counters. See this example.

> Changing the order of the number with respect to the title.

Not exactly sure what you mean, but you can always rewrite the title text using pandoc filters.

brainchild0 commented 5 years ago

> This is however not limited to titles,

Absolutely. A solution should generalize to all related cases.

> > Using non-Arabic numerals

> That should be possible by hiding the span, and using CSS counters. See this example.

I experimented with counters and failed to make them work. If you can show a specific example of a CSS document that can be added to a Pandoc-generated EPUB file to change the heading shown to the user from "4 Some Chapter" to "Chapter IV: Some Chapter", and that behaves consistently across viewers, I would be very impressed, as I have already tried and failed to make it work anywhere.

Here are the pitfalls.

  1. Counters are a new feature in CSS, and support may be inconsistent across display environments.
  2. They are included in the EPUB profile, but it is unclear whether they are scoped to the entire book and, if so, whether they advance in the expected reading order. Originally, they were intended for list items within a single (X)HTML document.
  3. (This is the big one.) In general, counters together with the CSS content property allow the style sheet to add numbering to a sequence of blocks. However, they do not facilitate overriding the existing text. Setting the span to hidden will also hide the content value, while keeping the span visible means that the added numeral sits next to the existing Arabic numeral. For your suggestion to work, the span text would need to be empty, with the default Arabic numeral provided by the default stylesheet, not the default XHTML layout. Only then could the Arabic numeral be overridden by an added (cascaded) stylesheet. Overall, this idea may be viable, but not without a code change.

> > Changing the order of the number with respect to the title.

> Not exactly sure what you mean, but you can always rewrite the title text using pandoc filters.

For example:

Title of Leading Chapter (1)
Title of Succeeding Chapter (2)
Title of Trailing Chapter (3)

Or:

flush left: Title of Leading Chapter; flush right: 1 (etc.)

Obviously rare, but should it not be supported? Maybe it is common in some locales. (Recall equation numbering in mathematical typesetting.)

mb21 commented 5 years ago

Yeah, then you're left with writing a Lua filter for the moment. And this issue would be a subset of #813.

brainchild0 commented 5 years ago

Please note that this issue, as I frame it, covers functionality that #813 does not address in its current form. One observation is that #813 appears to presuppose a particular format for sequenced references (e.g. "Figure 1", not "Fig. I"). Another is that a chapter heading might be typeset according to specifications more general than what would appear inside an inline reference: for example, preceding each chapter heading with a page break and a specific amount of vertical space, setting the words "Chapter X" in a specific family, size, face, alignment, numeral type, and vertical spacing, and following them with a separate paragraph block containing the actual title, typeset in a possibly different family, face, and alignment.

To unify both issues, we must include, at least, the following.

  1. How each item is typeset in the heading or caption (e.g. large text for chapter headings, bold figure labels, etc.).
  2. How each item is typeset in the referring text.
  3. How each item is labelled, including numbering and text, in the heading or caption (e.g. "Chapter I", "Figure 1").
  4. How each item is labelled, including numbering and text, in the referring text (e.g. "Fig. 1").
  5. How each item in the source document is given a label for internal references.
  6. How the source document represents internal references inline in the source text.

Note that 1-4 are style issues, separate from the format of the source document, and closer to the current issue, whereas 5-6 are questions of source formatting, which I think is the core of #813.

Actually, I am wondering whether one would say that this issue and #813 have much overlap, even if they might both be solved by the same comprehensive feature.