w3c / publishingcg

Repository of the Publishing Community Group
https://www.w3.org/community/publishingcg/

Digital Only Page Breaks #19

Open clapierre opened 3 years ago

clapierre commented 3 years ago

This relates to w3c/epub-specs#1599: cases where the publisher embeds page breaks in the EPUB but there is no print equivalent.

I think we all agree that publisher-embedded page breaks are a better solution than having no page breaks at all and leaving it up to the Reading Systems (even with guidance from us). Each Reading System may end up with a different algorithm, and then citations and navigating to a specific page will be problematic across different reading systems.

The big question is, what do we recommend adding to the metadata to indicate that these page breaks are "virtual/digital only" in nature and not synced with a printed book? Or to indicate that the source of the pages is a Word document or PDF?

This could be addressed in accessibilityFeature with a new value, which we have total control over. However, if we don't feel this is specifically an "accessibility" feature and is more of a "usability" feature, where does this go?

Alternatively, could this be addressed using a refines statement on source-of, with a value such as "digitalPagination" or something like that?
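To make the two options above concrete, here is a minimal sketch of how the existing metadata looks and how a tool might tell "page markers with a declared source" apart from "page markers with no source". It assumes the EPUB 3 pattern of a `dc:source` element refined by `source-of` with the value `pagination`, plus the `printPageNumbers` feature value discussed in this issue. The sample package document, the placeholder identifiers, and the function name `describe_pagination` are illustrative assumptions, not anything proposed in this thread.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical package document; the identifiers are placeholders.
SAMPLE_OPF = """
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="uid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="uid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Example</dc:title>
    <dc:language>en</dc:language>
    <dc:source id="src">urn:isbn:9780000000000</dc:source>
    <meta refines="#src" property="source-of">pagination</meta>
    <meta property="schema:accessibilityFeature">printPageNumbers</meta>
  </metadata>
</package>
"""


def describe_pagination(opf_xml: str) -> dict:
    """Report whether page markers are declared and whether a source is named."""
    root = ET.fromstring(opf_xml)
    metadata = root.find("opf:metadata", NS)
    metas = metadata.findall("opf:meta", NS)

    # Values of schema:accessibilityFeature declared in the package.
    features = {
        (m.text or "").strip()
        for m in metas
        if m.get("property") == "schema:accessibilityFeature"
    }

    # ids of dc:source elements that a source-of refinement could point at.
    source_ids = {s.get("id") for s in metadata.findall("dc:source", NS) if s.get("id")}

    # A pagination source exists if some meta refines a dc:source with
    # property "source-of" and value "pagination".
    has_source = any(
        m.get("property") == "source-of"
        and (m.text or "").strip() == "pagination"
        and (m.get("refines") or "").lstrip("#") in source_ids
        for m in metas
    )

    return {
        "declares_page_markers": "printPageNumbers" in features,
        "has_pagination_source": has_source,
    }


print(describe_pagination(SAMPLE_OPF))
```

A publication with authored page breaks but no print equivalent would report `declares_page_markers` as true and `has_pagination_source` as false, which is exactly the case that currently has no dedicated metadata value.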

mattgarrish commented 3 years ago

> The big question is, what do we recommend to be added to the metadata to indicate these page breaks are "virtual/digital Only" in nature and not synced with a print page book?

"Print page markers" works fine as a term for static sources, whether they've been printed out or not. We shouldn't proliferate terms just to achieve some small measure of technical accuracy.

The issue expressed previously concerns books that have no other source, so there is no case in which one user has page breaks and another does not.

Just as you can't expect two entirely different versions of a print book to have the same pagination, you can't expect two different solely-digital works to have the same pagination. Even if we were to develop an algorithm it's unlikely they would line up given differences in front matter, text abridgement, etc.

In other words, if a professor tells you to get a digital edition and you go out and get an entirely different edition, that's your problem if it doesn't have pagination, or has different pagination. It's not an accessibility issue.

It feels like we're trying to push a marketing feature into the accessibility metadata. Buy our edition because that other book over there doesn't have any static locations you can all coordinate with.

But to turn this around, is there a reason why listing that these books have a page list isn't sufficient? What extra importance does listing page locations with no source add?

jenstroeger commented 3 years ago

The question I’d have is: how would you determine the locations of these “virtual” page breaks within the ebook?

Oftentimes I work with the pages from the paged manuscript. While just an approximation, it’s still a set of page breaks scattered throughout the book that helps the reader orient herself, and that also helps build an Index with the look & feel of a printed Index.

I think ebook readers like the Nook have attempted to impose virtual pages onto the flow text as well, with more or less consistent results… 🤔

jenstroeger commented 3 years ago

Hmm… related issue https://github.com/w3c/epub-specs/issues/1542?

mattgarrish commented 3 years ago

> how would you determine the locations of these “virtual” page breaks within the ebook?

There's no standard for this right now. A task force is looking at whether we could define a common way for authors to insert page breaks.

> I think ebook readers like the Nook have attempted to impose virtual pages onto the flow text as well

Right, and this is why "virtual" page breaks is a confusing term. The pagination that reading systems produce is truly virtual whereas what we're talking about are authored page break markers without a source or a specific definition of a "page".
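As an illustration of what "authored page break markers" means in practice, the sketch below lists the markers found in a content document. It assumes the common EPUB 3 markup of an element carrying `epub:type="pagebreak"` and/or `role="doc-pagebreak"`, with the page label in `aria-label`. The sample XHTML and the function name `authored_page_breaks` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the epub:type attribute.
EPUB_TYPE = "{http://www.idpf.org/2007/ops}type"

# Hypothetical content document with two authored page break markers.
SAMPLE_XHTML = """
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <body>
    <p>First page text.</p>
    <span epub:type="pagebreak" role="doc-pagebreak" id="page2" aria-label="2"/>
    <p>Second page text.</p>
    <span epub:type="pagebreak" role="doc-pagebreak" id="page3" aria-label="3"/>
  </body>
</html>
"""


def authored_page_breaks(xhtml: str) -> list:
    """Return (id, label) pairs for every authored page break marker."""
    root = ET.fromstring(xhtml)
    breaks = []
    for el in root.iter():
        # epub:type may hold several space-separated values.
        types = el.get(EPUB_TYPE, "").split()
        if "pagebreak" in types or el.get("role") == "doc-pagebreak":
            label = el.get("aria-label") or el.get("title") or (el.text or "").strip()
            breaks.append((el.get("id"), label))
    return breaks


print(authored_page_breaks(SAMPLE_XHTML))
```

Nothing in this markup says where the page numbers came from, so a marker placed by the publisher with no print source looks identical to one that mirrors a printed edition.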

avneeshsingh commented 3 years ago

The issue was discussed in the June 24, 2021 a11y task force meeting. The summary is as follows:

iherman commented 3 years ago

The issue was discussed in a meeting on 2021-06-24

### 2. Digital Only Page Breaks

_See github issue [#1622](https://github.com/w3c/epub-specs/issues/1622)._

**Avneesh Singh:** The issue has gone in different directions. We have been discussing reading systems using algorithm for creating static pagebreaks and another discussion is about new accessibility metadata issue.
… reading systems and distributers can insert page numbers, but I think we should be concerned only with the static page numbers/marks provided by publishers because page navigation in different instances of the publication can be consistent if it comes from publishers. The implementation of auto generated page markers can differ from one distributor to another and from one reading system to another.
… But the main topic of this issue is value for `accessibilityFeatures=printPageNumbers`
… should we add another for digital only? And does this work belong to this group or to the Publishing CG (a lot of accessibility metadata work is happening there)?

**Matt Garrish:** abuse of `dc:source` if there is a print equivalent. we have two pieces of metadata `printPageNumbers` that you have a source that there is a source paper.
… need to look at what we have and maybe we can provide page list, but this may not be the group to do this, and we need to formalize the schema.org metadata values. Where is the correct group to do this. and maintain this information and hash out these additional issues.

**Avneesh Singh:** EPUB accessibility specs are normative, and schema.org metadata is not part of it, its only in the techniques. So may not be required to address here.

**Tzviya Siegman:** Where did this start from IDPF?

**Matt Garrish:** Originally this work was done from Benetech and WGBH

> *Avneesh: we have control over the values. Regarding decision on the governance group for accessibility metadata values of schema.org, I will be sending an email with the investigation.*

**George Kerscher:** the locators WG we will be discussing this, the publishers need to do this and I don't think that anyone else shouldn't have the ability to do and could be issues with copyright, and for a11y Summary and virtual page breaks have been added as well as a page list.
… virtual page #'s need to be added to do citations, and maybe they have an agreement to modify the books to add page #'s upon ingestion.

**Avneesh Singh:** static page #'s inside the EPUB is best done by the publishers.
… Regarding a11y metadata values we can do this here or the Publishing CG, decide who could take over the governance of these values and defer this issue for now.

mattgarrish commented 3 years ago

I'm going to transfer this issue to the publishing CG repository.