readium / webpub-manifest

📜 A JSON based Web Publication Manifest format used at the core of the Readium project
BSD 3-Clause "New" or "Revised" License
91 stars 23 forks source link

TOC may contain entries with text but no link #23

Open danielweck opened 5 years ago

danielweck commented 5 years ago

JSON Schema:

https://github.com/readium/webpub-manifest/blob/e289dbb14e1b3c519b175be9b8ff5c9ed4d22c42/schema/link.schema.json#L64-L66

For example, "Childrens Literature":

https://idpf.github.io/epub3-samples/30/samples.html#childrens-literature https://github.com/IDPF/epub3-samples/tree/master/30/childrens-literature

https://github.com/IDPF/epub3-samples/blob/50415310ce853c1bc52ceb19f0e3319b470bc524/30/childrens-literature/EPUB/nav.xhtml#L23-L25

<li class="body">
<span class="author">Abram S. Isaacs</span>
<ol> ...

Scroll down to TOC "title": "Abram S. Isaacs":

https://readium2.herokuapp.com/pub/L2FwcC9taXNjL2VwdWJzL2NoaWxkcmVucy1saXRlcmF0dXJlLmVwdWI%3D/manifest.json/show/all

(or if the above link fails to load: https://readium2.now.sh/pub/L2hvbWUvbm93dXNlci9zcmMvbWlzYy9lcHVicy9jaGlsZHJlbnMtbGl0ZXJhdHVyZS5lcHVi/manifest.json/show/all )

title, children, but no href:

{
    "title": "Abram S. Isaacs",
    "children": [ ... ]
}
HadrienGardeur commented 5 years ago

IMO the title of this issue is confusing.

It's perfectly fine having href required in the Link Object, the problem is elsewhere.

In EPUB, we could say that the ToC is a tree structure where some nodes do not have an href (is a title actually required?). In RWPM, the ToC is a tree structure as well, but nodes are required to have an href.

Since we're using the same Link Object everywhere, this means that when we parse an EPUB, we need a default value for href. Something like # might do the trick.

cc @JayPanoz as well since this is related to parsing

danielweck commented 5 years ago

In the TypeScript deserialization process there is a informative (non-breaking) "verifier" which checks that "no-href is okay if there are link children" ... but it's really just a workaround at the moment.

I would hate to change the EPUB parser at this stage, because of regression bugs. There are clients that consume / expect nil-href in order to render the TOC correctly (i.e. with a "group" heading and no actual link). If we now add "#" to missing href, these API consumers will break.

HadrienGardeur commented 5 years ago

Changing the Link Object for the same reason is IMO even worse since it affects pretty much everything in RWPM and OPDS 2.0.

We would have to duplicate this requirement for href (which we've had from the start, I've actually relaxed requirements for the Link Object over time) pretty much everywhere.

danielweck commented 5 years ago

Can't the JSON Schema check for non-nil children and allow nil-href in this case?

HadrienGardeur commented 5 years ago

It's always possible to craft something, but it goes against the core model (that we use everywhere right now) and it probably means that we wouldn't be able to re-use the Link Object JSON Schema anymore.

That's not a very attractive proposal IMO.