w3c / epub-specs

Shared workspace for EPUB 3 specifications.

Which aspects of EPUB 3.1 break backward compatibility with 3.0.1? #993

Closed llemeurfr closed 6 years ago

llemeurfr commented 6 years ago

I don't know if the question has already been discussed from this angle in a complete way.

The document http://www.idpf.org/epub/31/spec/epub-changes.html lists changes, but may not be exhaustive about the added properties (added attributes like duration or opf:file-as are listed "en passant", the removal of -epub- CSS properties is not listed).

And in #984 there are mentions of different views on this aspect. Especially from @therealglazou

As an implementor, it is very clear, has always been very clear, that 3.0.1 and 3.1 processors are different.

and @murata2makoto

In the case of EPUB 3.1, I did (and still do) believe that it is simply incompatible with EPUB 3.0 but everybody argued otherwise. I do not want to repeat the mess.

For sure, the removal of some features, such as guide and bindings (opf), switch and trigger (content) broke backward compatibility.

Also, the evolution of the package version from "3.0" to "3.1" broke backward compatibility because a "3.1" user agent may refuse to handle a "3.0" package (it would be easy to correct it in the spec by allowing both values).

The removal of the EPUB CSS profile and its -epub- prefixed properties broke backward compatibility also.

But did deprecation (which means that authors should avoid it, epubcheck will raise an alert, but RS must continue to support it) really break backward compatibility? If we settle on the definition proposed in #984, it does not.

So, it would be interesting to list here what else in 3.1 broke backward compatibility, so that we can safely repair it in 3.x.x.

therealglazou commented 6 years ago

The removal of the EPUB CSS profile and its -epub- prefixed properties broke backward compatibility also.

Wait. This and stuff related to the XML files of a package live in two very different layers of compatibility. If the -epub-* CSS-like properties are not implemented, you will still see content. Maybe badly presented but the content is viewable. If for instance the processor chokes on the opf file, you won't even open the ebook...

Furthermore, the EPUB CSS Profile relies on an extension of the main browser's rendering engine, or the availability of that profile in a rendering engine. That's a very strong dependency that's almost not in our hands.

mattgarrish commented 6 years ago

Also, the evolution of the package version from "3.0" to "3.1" broke backward compatibility because a "3.1" user agent may refuse to handle a "3.0" package

Technically, the problem is the other way around. For lower versions, we've always had a must (even in 3.0):

[A Reading System] must attempt to process an EPUB Publication whose version is lower than "3.1".

(Of course, that's no guarantee in the real world and we couldn't check until there are supporting systems.)

A 3.0 processor, however, might not open a 3.1 because it is only a should to open newer versions:

It should attempt to process any given Rendition of an EPUB Publication whose Package Document version attribute designates a version higher than "3.0".

And this did prove true with at least one major reading system.
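The asymmetry between the two quoted requirements can be sketched in a few lines of Python. This is only an illustration: `should_open`, `supported` and `strict_on_newer` are hypothetical names, and the `strict_on_newer` switch models a reading system that uses the latitude of the "should" to reject newer packages.

```python
from typing import Tuple


def parse_version(value: str) -> Tuple[int, ...]:
    """Turn a package version string such as "3.1" into a comparable tuple."""
    return tuple(int(part) for part in value.strip().split("."))


def should_open(package_version: str,
                supported: str = "3.0",
                strict_on_newer: bool = False) -> bool:
    """Decide whether a reading system built for `supported` opens a package.

    Hypothetical policy sketch, not taken from any real reading system.
    """
    pkg = parse_version(package_version)
    sup = parse_version(supported)
    if pkg <= sup:
        # Same or lower version: the spec's "must attempt to process" applies.
        return True
    # Higher version: only a "should", so a strict implementation may refuse.
    return not strict_on_newer
```

With these assumptions, a lenient 3.0 processor opens a 3.1 package, while a strict one refuses it; that refusal is the failure mode described above.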

Otherwise, there's really nothing that makes 3.0 incompatible with 3.1 other than some dropped features.

It's still a fairly major revision, though, because it takes us off specific references to a number of specifications, so we're leaping forward from HTML 5.0 to 5.2 and SVG 1.1 to 2.0, for example. None of those changes were backwards incompatible as far as I recall, though.

Non-negligible costs of 3.1 have been noted elsewhere in these discussions, but it's a mischaracterization to suggest that this update will make them go away. Reading systems, authoring tools and validators will still have significant costs to upgrade and support these features in whatever we call the revised documents. And publishers are still going to be left waiting for support and dealing with people who use out-of-date 3.0 reading systems that don't support the new features.

llemeurfr commented 6 years ago

@mattgarrish we're leaping forward from HTML 5.0 to 5.2 and SVG 1.1 to 2.0, for example

  • Reading systems ... will still have significant costs to upgrade and support these features in whatever we call the revised documents.

Well, for EPUB 3.0.1 reading systems using a browser engine, I'm not seeing a huge effort supporting the few things we're talking about. The issue is more for other reading systems, not based on browser engines in constant evolution. And this is IMO one of the main reasons why EPUB 2 is still there: e-readers with custom rendering engines can't use "any" (X)HTML5 and "any" CSS3, so they are stuck with EPUB 2 or a "black and white" subset of EPUB 3. This is something we should discuss (in another thread).

mattgarrish commented 6 years ago

reading systems using a browser engine, I'm not seeing a huge effort supporting the few things we're talking about

Someone still has to do the work in the browser engine, though. Once that work is done it might not take a lot of effort to plug it in to a reading system, but the cost and time still exist. Some things will come quickly and others, like SVG 2, are off on the horizon somewhere.

And then you still have to get latest versions deployed and in use by significant numbers of people. As in the past, people will run to the vendors with their nice, 3.2-compliant content that validates to "3.0" but not to the version of epubcheck used by the vendor, or what they support in their reading system.

We should be careful in talking about doing another revision not to oversell what the change of package number will actually deliver relative to 3.1.

dauwhe commented 6 years ago

I made an EPUB just now with <package version="3.1">. Worked in iBooks, ADE 3.0, Azardi, Calibre, and (drum roll) Kindle Previewer.

(sorry, the markup in the original version disappeared)

mattgarrish commented 6 years ago

I made an EPUB just now with <package version="3.1">. Worked in iBooks,

That's interesting, because that was the one it didn't work in when I tried similar tests back when we were discussing this last year...

dauwhe commented 6 years ago

That's interesting, because that was the one it didn't work in when I tried similar tests back when we were discussing this last year...

This was desktop iBooks 1.2, so not a real test. Just trying what was closest at hand. Will try some more RSs.

dauwhe commented 6 years ago

This was desktop iBooks 1.2, so not a real test. Just trying what was closest at hand. Will try some more RSs.

Works in iBooks on my iPhone (iOS 11). And Readium Cloud Reader (localhost/desktop). And Aldiko (iOS 11)

JayPanoz commented 6 years ago

Obligatory word of caution.

Reading Systems may handle side loaded files very differently from distributed files. And they may not even bother checking the version in the former case, esp. when allowing EPUB3 features for EPUB2 files.

I’ve just side loaded an EPUB4 file in iBooks, the Readium Chrome app, Adobe Digital Editions 4, Aldiko, and Bookfvia for instance.

therealglazou commented 6 years ago

Is this proving that RS do not care about the version number and that conformance criteria on that ground in the spec are pointless because unimplemented, except by epubcheck and BlueGriffon?

JayPanoz commented 6 years ago

Well, it at least shows some efforts could be put into practical implementations, interop and compatibility. In issue #992 for instance, Ben Dugas (Kobo) explains:

As a retailer/eBook platform we ignore most OPF metadata of the publication date, author category and only read items that affect content display (rendition spread, RTL, ePub type). This is because we only want to read the ONIX/external metadata.

And I know Apple may be relying on a plist file which does the heavy lifting for metadata/rendition/interactivity (primary language, page-progression, touchHandling, scroll-axis, etc.).

So when you side load a file, you won’t necessarily get the same results (and implementations may be quick and dirty).

JayPanoz commented 6 years ago

But yeah you can probably assume the only thing preventing anyone from doing absolutely anything is ePubCheck (most Reading Systems will probably try to display contents even if the OPF is broken).

[Edit] Here’s what happens when the EPUB file is incorrectly zipped for instance.

mattgarrish commented 6 years ago

Reading Systems may handle side loaded files very differently from distributed files.

Sure, I wasn't inferring from Dave's testing that it actually supported 3.1. What is interesting rather is that it has gone from failing to load an EPUB with version="3.1" in the package file to loading one. I have no doubt Apple won't ingest a 3.1 into their bookstore yet, but work has obviously been done to accommodate newer versions. Another example of how we may be underestimating preparations for 3.1.

llemeurfr commented 6 years ago

At this step of the discussion, I still struggle to understand @therealglazou's statement: "it is very clear, has always been very clear, that 3.0.1 and 3.1 processors are different." Daniel, could you please detail this for us?

therealglazou commented 6 years ago

Daniel, could you please detail this for us?

The hierarchical organization of metadata done through refines attributes has a deep impact on the structure of an object model for an EPUB package. The removal of the OPF2 meta element and the changes in precedence order of linked records also had an impact. The guide element was removed (so a conforming tool should not deal with it nor output it) and NCX almost dropped (marked for removal IIRC). switch and trigger elements were removed, which imposes changes on content documents that used them. Many CSS epub-prefixed properties were removed in favor of more standard versions when existing, breaking compatibility too.

All in all, dealing with EPUB 3.0.1 and 3.1 is, because of the above, two different things. Some data were structurally changed, some were dropped, some were obsoleted. You can obviously reuse some bits from a 3.0.1 parser into a 3.1 parser and some bits from a 3.0.1 serializer into a 3.1 serializer, but they differ.
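To make the point about refines concrete, here is a minimal Python sketch of how a parser might group `meta` elements under the element they refine. The sample package, its ids and values, and the `refinements` helper are all made up for illustration. A real object model would also have to handle refinements of refinements and legacy OPF2 `meta` elements, which this sketch simply skips.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

OPF_NS = "http://www.idpf.org/2007/opf"

# Hypothetical sample package; ids and values are invented for the example.
SAMPLE = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="uid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="uid">urn:uuid:hypothetical</dc:identifier>
    <dc:creator id="creator01">Jane Doe</dc:creator>
    <meta refines="#creator01" property="role" scheme="marc:relators">aut</meta>
    <meta refines="#creator01" property="file-as">Doe, Jane</meta>
    <meta name="cover" content="cover-image"/>
  </metadata>
</package>"""


def refinements(opf_xml):
    """Map each refined id to the (property, value) pairs that refine it."""
    root = ET.fromstring(opf_xml)
    metadata = root.find(f"{{{OPF_NS}}}metadata")
    by_target = defaultdict(list)
    for meta in metadata.findall(f"{{{OPF_NS}}}meta"):
        target = meta.get("refines")
        prop = meta.get("property")
        if target and prop:
            # refines holds a fragment reference such as "#creator01".
            by_target[target.lstrip("#")].append((prop, (meta.text or "").strip()))
    return dict(by_target)
```

Calling `refinements(SAMPLE)` groups the role and file-as values under `creator01`, which is the kind of nesting a flat name/value model cannot represent.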

kevinhendricks commented 6 years ago

One sane approach is to use epubcheck output to "guide" the ebook developer into a good ebook format that will serve the most readers. Things like refines of refines, epub:switch, epub:trigger, and some of the more nasty aspects of 3.0 that we really did not want in 3.1 can be given a warning similar to the following:

Warning: The use of refines of refines on metadata in the opf has extremely limited adoption in e-readers. Alternative, more broadly accepted, solutions should be considered in its place.

Warning: The epub:switch element has almost no adoption in e-readers. Alternative, more broadly accepted solutions should be considered in its place.

etc.

Given how epubcheck is religiously used to "vet" most commercial epubs (even warnings can be cause for rejection), you will be guiding people away from their use and helping to forge a sustainable future for the epub format, while not invalidating existing epubs already shipped.
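As a rough sketch of the idea, assuming nothing about how epubcheck is actually implemented, a checker could scan a content document for the discouraged elements and emit warnings instead of errors. The `adoption_warnings` function and the sample document are hypothetical, and the epub:trigger message simply reuses the wording proposed above for epub:switch.

```python
import xml.etree.ElementTree as ET

OPS_NS = "http://www.idpf.org/2007/ops"
DISCOURAGED = ("switch", "trigger")

# Hypothetical content document that uses epub:switch.
SAMPLE_DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <body>
    <epub:switch>
      <epub:default><p>fallback</p></epub:default>
    </epub:switch>
  </body>
</html>"""


def adoption_warnings(xhtml):
    """Return one warning per discouraged epub: element found in the document."""
    root = ET.fromstring(xhtml)
    warnings = []
    for name in DISCOURAGED:
        # ".//{ns}name" searches all descendants for the namespaced element.
        if root.find(f".//{{{OPS_NS}}}{name}") is not None:
            warnings.append(
                f"Warning: The epub:{name} element has almost no adoption in "
                "e-readers. Alternative, more broadly accepted solutions "
                "should be considered in its place."
            )
    return warnings
```

Because these are reported as warnings rather than errors, already shipped EPUBs would stay valid while new ones are steered away from the features.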

As for fighting over metadata formats for user centred metadata (NOT page layout or direction), this is silly. The opf can be parsed and can easily grok metadata from epub2, epub3.01 and epub 3.1 alike. Simply spec a superset of epub2, epub3.01, and epub3.1 metadata as your epub3.2 spec and again use epubcheck warnings to guide new ebooks to the place you want them.
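A minimal sketch of such a "superset" reader, in Python, might accept Dublin Core elements, EPUB 2 style `meta name/content` pairs and EPUB 3 style `meta property` elements in one pass. The `read_metadata` helper and the sample package are hypothetical, and link elements and refines are deliberately left out.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical package mixing EPUB 2 and EPUB 3 metadata styles.
MIXED = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="uid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example</dc:title>
    <meta property="dcterms:modified">2018-01-01T00:00:00Z</meta>
    <meta name="cover" content="cover-img"/>
  </metadata>
</package>"""


def read_metadata(opf_xml):
    """Collect (key, value) pairs from both old and new metadata syntaxes."""
    root = ET.fromstring(opf_xml)
    metadata = root.find(f"{{{OPF_NS}}}metadata")
    entries = []
    for child in metadata:
        text = (child.text or "").strip()
        if child.tag.startswith(f"{{{DC_NS}}}"):
            # Dublin Core element, shared by EPUB 2 and EPUB 3.
            entries.append(("dc:" + child.tag.split("}")[1], text))
        elif child.tag == f"{{{OPF_NS}}}meta":
            if child.get("property"):
                # EPUB 3 style: <meta property="...">value</meta>
                entries.append((child.get("property"), text))
            elif child.get("name"):
                # EPUB 2 style: <meta name="..." content="..."/>
                entries.append((child.get("name"), child.get("content", "")))
    return entries
```

The reader does not branch on the package version at all, which is the point being made: the syntaxes can coexist in one parser.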

You might want to keep the link element in metadata but restrict it to B2B metadata, and keep the B2C metadata most readers care about using the dc and dcterm base with some simple additions.

mattgarrish commented 6 years ago

Closing this issue as I don't see it as something we can act on in the 3.2 revision. If breaking changes haven't yet been identified for undoing, we'll need separate issues for them.