ways2read / InDesignA11Y

Issue tracking improvements in InDesign for better accessible EPUB export

Page Title #4

Open gregoriopellegrino opened 2 years ago

gregoriopellegrino commented 2 years ago

Knowledge base reference: http://kb.daisy.org/publishing/docs/html/title.html

Issue

InDesign's export to EPUB 3 puts a placeholder in the <title> tag (the file name plus a sequential number). This is a problem because people using assistive technologies often rely on the information in the <title> tag to know where they are in the publication.

<title>0-source</title>

How to fix it

Instead of the placeholder, use the most important title in the HTML file, along with the title of the publication (from metadata).

<title>Chapter One - Title of the Publication</title>

LauraB7 commented 1 year ago

User story: "As a user important orientation information is displayed as I move through the publication."

Racmathu commented 4 months ago

@LauraB7 and @gregoriopellegrino , is this requirement only for Fixed Layout?

gregoriopellegrino commented 4 months ago

Dear @Racmathu , this is both for Fixed Layout and Reflowable.

Racmathu commented 4 months ago

@gregoriopellegrino Thanks for the quick reply. Would like to get more clarity about the requirements:

LauraB7 commented 4 months ago

Hi @Racmathu. The group has discussed this issue and here is our proposal.

This isn't perfect, as there may well be situations where there is neither a header nor semantics. And semantics (epub:type) are tricky to apply to threaded content.

Whatever is put in the <title> will need to be localized.

gregoriopellegrino commented 2 months ago

Dear @raman211, to prepare ourselves for the call scheduled on Monday, June 17th: are there any new updates or releases related to this issue?

Currently, we are running the InDesign Pre-Release 19.5 (Mainline CI #f95bd09) version on our machines.

Thanks

NawneetG commented 2 months ago

Dear @gregoriopellegrino @ways2read @LauraB7 We have dropped the latest PR 19.5.0.58 build. Please test the feature and give us your valuable feedback. We will be waiting to hear from you. @Racmathu @raman211

jonaslil commented 2 months ago

We have tested the page title feature and noticed that mostly, the text of the first h# element in the file is used as page title. We like the solution and it works well in many cases. There are some issues, however, which we can discuss in tomorrow's meeting. Our main points:

Fallback mechanism

At present, if there is no heading, InDesign uses the XHTML file name as title. We suggest using the following fallback chain:

  1. First heading found in content
  2. First five words in file content, followed by a space and an ellipsis
  3. Document title from InDesign file info metadata
  4. XHTML filename without file extension (= current fallback)
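
The chain above can be sketched in a few lines of Python. This is only an illustration of the proposal, not how InDesign implements it: the function name `pick_title` and its parameters are hypothetical, and appending the ellipsis even when the text has fewer than five words is an assumption.

```python
import os
from typing import Optional


def pick_title(first_heading: Optional[str], body_text: str,
               doc_title: Optional[str], xhtml_filename: str) -> str:
    """Return a <title> candidate using the four-step fallback chain (hypothetical helper)."""
    # 1. First heading found in content (whitespace collapsed to single spaces)
    if first_heading and first_heading.strip():
        return " ".join(first_heading.split())
    # 2. First five words of the file content, followed by a space and an ellipsis
    #    (assumption: the ellipsis is appended even for shorter texts)
    words = body_text.split()
    if words:
        return " ".join(words[:5]) + " \u2026"
    # 3. Document title from the file info metadata
    if doc_title and doc_title.strip():
        return doc_title.strip()
    # 4. XHTML filename without file extension (the current fallback)
    return os.path.splitext(os.path.basename(xhtml_filename))[0]
```

For example, a file with no heading and no text, in a document without a metadata title, ends up with the bare file name, as it does today.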

Include the entire heading text

NawneetG commented 2 months ago

Thanks @jonaslil for your valuable feedback. We have acknowledged it and will discuss further in tomorrow's meeting. @Racmathu @raman211

gregoriopellegrino commented 2 months ago

Dear @Racmathu @raman211 @NawneetG, we have prepared an InDesign test file to test the fallback mechanism proposed in the comment https://github.com/ways2read/InDesignA11Y/issues/4#issuecomment-2173745070

Test file: test-title.indd.zip

It is a kind of unit-test file: on each page a different case is presented and the output we expect within the <title> tag is indicated.

Note: in the reflowable EPUB export, the option to split the EPUB file according to the paragraph style settings must be enabled.

We remain available for any questions.

NawneetG commented 2 months ago

Dear @gregoriopellegrino @ways2read @LauraB7 We have dropped the latest PR 19.5.0.77 build. We have incorporated the suggestions and the fallback mechanism. One change on our side: for the title taken from a paragraph, instead of the first five words we take the first 50 characters. We will be waiting to hear from you. @Racmathu @raman211

raman211 commented 2 months ago

@jonaslil Could you please help us validate the fallback mechanism suggested above? https://github.com/ways2read/InDesignA11Y/issues/4#issuecomment-2173745070

jonaslil commented 2 months ago

@raman211 , here's our feedback. This is mostly based on testing in 19.5.0.77, and also partly in 19.5.0.83.

Using the first h# in the content file as <title> works reliably in our tests. The heading text is not truncated and span tags in the h# do not cause problems anymore. Taking the first 50 characters of the document text (fallback 2) works fine, at least in the languages we do our testing in.

We did notice three remaining issues:

  1. A <br> in the heading is replaced by a line separator character (U+2028) in the <title>, not a space (U+0020) as it should be.
  2. Fallback 3, using the document title from InDesign file metadata when no text is present in the XHTML file, does not work. InDesign inserts an object replacement character (U+FFFC) as the only content of the <title>. (See the last xhtml file in test-title.epub below.)
  3. Fallbacks for content files without headings are not working properly when there is a heading in a later XHTML file. In this case, the heading found later in the epub is used as <title> also for the preceding files.
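
To make the expected behaviour for issues 1 and 2 concrete, here is a minimal Python sketch of the clean-up we would expect before text is written into the <title>. The function name `clean_title` is hypothetical and this is not InDesign's code; it only shows the intended result. An empty return value means the next fallback in the chain should be tried.

```python
LINE_SEPARATOR = "\u2028"      # what a <br> in the heading currently becomes
OBJECT_REPLACEMENT = "\ufffc"  # what currently appears when the file has no text


def clean_title(raw: str) -> str:
    """Normalize candidate <title> text (hypothetical helper)."""
    # Issue 1: a line break in the heading should become an ordinary space
    text = raw.replace(LINE_SEPARATOR, " ")
    # Issue 2: object replacement characters carry no readable text, so drop them
    text = text.replace(OBJECT_REPLACEMENT, "")
    # Collapse runs of whitespace and trim both ends
    return " ".join(text.split())
```

With this, a heading exported as "Chapter", U+2028, "One" yields "Chapter One", and a candidate consisting only of U+FFFC yields an empty string.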

Issues 1 and 2 can be seen in this epub, created from the test file we submitted earlier. test-title.epub.zip

To demonstrate issue 3, we created a modified test file, with an additional page with a heading at the end. When this file is exported to epub, the last three XHTML files have the same title, based on the h2 heading in the last file. test-title-0.2.indd.zip test-title-0.2.epub.zip

raman211 commented 2 months ago

@jonaslil Due to the timelines, we were not able to accommodate fixes for the issues reported in https://github.com/ways2read/InDesignA11Y/issues/4#issuecomment-2209100568. Apart from those, the other changes are available in the ID19.5 release builds. We are planning to submit these fixes for the MAX release.

LauraB7 commented 2 months ago

Hello @raman211. Aside from the lack of a <title> element when the HTML file contains no text, there is another issue. Both the cover.xhtml and toc.xhtml files need better <title>s.

Our proposed solution is to avoid using the title of the book in the <title> field of those two files, and instead to use an internationalized version of "Cover" for cover.xhtml and of "Contents" for the navigation file. For the navigation file, the title will default to whatever the user designates as the navigation title in TOC Styles. To summarize:

  1. If the user has set to use a TOC style for the Navigation TOC and the selected TOC style has the Title field filled in, then use the contents of the Title field.
  2. Otherwise, use the word "Contents" localized to the language of the publication.
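
The two rules for the navigation file could look roughly like this in Python. Everything here is illustrative: `nav_title`, the `LOCALIZED_CONTENTS` table and its sample translations are hypothetical, and falling back to English for an unknown language is an assumption, not part of the proposal.

```python
from typing import Optional

# Hypothetical lookup table; a real implementation would use InDesign's own localization
LOCALIZED_CONTENTS = {"en": "Contents", "it": "Indice", "de": "Inhalt"}


def nav_title(toc_style_title: Optional[str], language: str) -> str:
    """Pick the <title> for the navigation file (hypothetical helper)."""
    # Rule 1: use the Title field of the selected TOC style when it is filled in
    if toc_style_title and toc_style_title.strip():
        return toc_style_title.strip()
    # Rule 2: otherwise use "Contents" localized to the publication language
    primary = language.split("-")[0].lower()
    # Assumption: fall back to English when the language is not in the table
    return LOCALIZED_CONTENTS.get(primary, LOCALIZED_CONTENTS["en"])
```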

jonaslil commented 4 days ago

We have noticed some remaining issues here (tested in v.20 prerelease builds):