Closed GoogleCodeExporter closed 9 years ago
[deleted comment]
This is also needed for magazines and journals. The PRISM/PSV group bumped up
against this problem too, and it came up on a NISO web conference this week.
Having said that, I should point out that the title-type property "collection"
was specifically intended to accommodate a series name (a series is a special
case of a collection, namely an ordered collection) and we intended the
group-position property on <meta> to provide a volume number. So I'm not sure
this requires a change to the spec; maybe just add these examples to make it
clear how these were intended.
Original comment by bkasd...@apexcovantage.com
on 18 Apr 2013 at 9:38
I've looked at the spec and there are no real examples. The only thing I could
find seems very similar to ONIX (collection + group-position).
In that regard, ONIX is not useful at all. ONIX does a decent job at cutting a
title into multiple elements and assigning them orders, but this is completely
different than properly expressing the semantics for series information.
Original comment by hadrien....@feedbooks.com
on 22 Apr 2013 at 9:46
The section for dc:title
(http://www.idpf.org/epub/30/spec/epub30-publications.html#sec-opf-dctitle) has
two relevant examples: The Lord of the Rings and The Great Cookbooks of the World.
I understand that this makes the design very generic (any title can have a
type, display-sequence and group-position) but it also makes things more
complicated:
- when people think of series, they don't think about them as a subtitle or
alternate title for the book, series are every bit as important as dc:title or
dc:publisher
- "collections" are used in a very different way in different countries, in
France for example we wouldn't think of series as ordered collections (even
though purely technically, they're indeed an ordered collection of books)
- defining the series information takes three elements (title and two
refinements) vs a single element for other key metadata information
For these three reasons, it feels like series are second-class citizens
in EPUB metadata, even though for manga, for example, this type of information
is more important than the main title.
I believe that the same thing could be said about volumes; I doubt that people
think about dc:identifier first when they think about volumes.
Original comment by hadrien....@feedbooks.com
on 26 Apr 2013 at 1:03
On the issue of series being more important than title for manga, I would
concur and point out that the same is probably true for magazines: the name of
the magazine is most important (which in effect is the series name) and the
actual publication often doesn't have a title at all, just a number and a date.
If it helps further this discussion, here's what the magazine industry is
proposing for how to create a dc:title for an issue of a magazine being
delivered as an EPUB:
"Specifying the dc:title for a book is straightforward. But specifying the
title for other content, such as a magazine issue, is more complex. When
packaging magazine or other serial content as an EPUB 3, you will need to
combine fields from PSV to provide a descriptive title for eReaders to display.
Best Practice: The dc:title should consist of PSV’s prism:publicationName |
prism:coverDisplayDate | prism:edition | prism:issueName. The metadata fields
should appear in this order. Not all fields are always present.
Examples of a derived EPUB 3 dc:title for serial publications:
• All You | June 22, 2012
• Fortune | May 21, 2012 | U.S. Edition | FORTUNE 500
• Sports Illustrated | February 17, 2012 | 2012 SWIMSUIT ISSUE | DOUBLE ISSUE
• Time International | June 4, 2012 | Time Asia"
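The best practice quoted above amounts to joining whichever PSV fields are present, in a fixed order, with " | ". A minimal Python sketch of that rule follows; the function name serial_title and its parameter names are made up for illustration and are not part of PRISM/PSV or EPUB.

```python
def serial_title(publication_name, cover_display_date=None, edition=None, issue_name=None):
    """Join the PSV fields that are present, in the recommended order."""
    fields = [publication_name, cover_display_date, edition, issue_name]
    # Fields that are missing (None or empty) are simply left out.
    return " | ".join(field for field in fields if field)


print(serial_title("All You", "June 22, 2012"))
print(serial_title("Fortune", "May 21, 2012", "U.S. Edition", "FORTUNE 500"))
```

The two calls reproduce the first two examples from the list above.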
Original comment by bkasd...@apexcovantage.com
on 10 Jul 2013 at 10:54
I found something interesting.
It seems that this is one of the most commonly used extensions to our core
metadata vocabulary, since Calibre (a popular app for managing EPUB metadata)
has its own elements for series:
<meta name="calibre:series" content="CAC"/>
<meta name="calibre:series_index" content="29"/>
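For what it's worth, picking these two values up on the reading-system side takes very little code. Here is a minimal Python sketch using only the standard library; the helper name calibre_series and the sample package wrapper are assumptions for illustration, not anything defined by Calibre or the spec.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Hypothetical package document containing only the two Calibre elements.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata>
    <meta name="calibre:series" content="CAC"/>
    <meta name="calibre:series_index" content="29"/>
  </metadata>
</package>"""


def calibre_series(opf_xml):
    """Return (series, index) from Calibre-style meta elements, or (None, None)."""
    root = ET.fromstring(opf_xml)
    values = {}
    for meta in root.iter(f"{{{OPF_NS}}}meta"):
        name = meta.get("name")
        if name in ("calibre:series", "calibre:series_index"):
            values[name] = meta.get("content")
    return values.get("calibre:series"), values.get("calibre:series_index")


print(calibre_series(SAMPLE_OPF))
```

The index is returned as a string because nothing in the snippet above says whether it has to be an integer.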
Original comment by hadrien....@feedbooks.com
on 11 Jul 2013 at 11:07
The way I read the specs, this is how you detail a series and issue number:
<dc:title id="collection">Scott Pilgrim</dc:title>
<meta refines="#collection" property="title-type">collection</meta>
<meta refines="#collection" property="group-position">1</meta>
The problem is that some reading systems ignore the refinements and treat every
dc:title as a standard title. iBooks, for instance, takes only the last title
for sideloaded books, whatever its title-type, and discards the rest. iTunes,
and thus books synced into iBooks through it, does the same with the first
title. For books downloaded from the iBookStore, iBooks takes whatever the
iBookStore tells it to take, I think.
If you have a bunch of title-types to choose from, the safest thing to do, I
guess, is to put titles that suit reading systems that ignore the possibility
of more than one title of different kinds in the first and last positions, and
cross your fingers, i.e.:
<dc:title id="title">Scott Pilgrim #1</dc:title>
<meta refines="#title" property="title-type">main</meta>
<meta refines="#title" property="display-seq">1</meta>
<dc:title id="subtitle">Precious little life</dc:title>
<meta refines="#subtitle" property="title-type">subtitle</meta>
<meta refines="#subtitle" property="display-seq">2</meta>
<dc:title id="collection">Scott Pilgrim</dc:title>
<meta refines="#collection" property="title-type">collection</meta>
<meta refines="#collection" property="group-position">1</meta>
<dc:title id="fulltitle">Scott Pilgrim #1. Precious little life</dc:title>
<meta refines="#fulltitle" property="title-type">expanded</meta>
Note that I do not bother abusing myself in such a way, as no reader that I
know of bothers supporting what I see as the official way as per the standard.
Adding Calibre's meta tag does not hurt.
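To show what "not ignoring the refining" would involve, here is a minimal Python sketch that groups dc:title elements by their title-type refinement, so a reader could pick the "main" title instead of the first or last one. The function name titles_by_type and the trimmed-down sample package are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical package document, shortened from the example above.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title id="title">Scott Pilgrim #1</dc:title>
    <meta refines="#title" property="title-type">main</meta>
    <dc:title id="collection">Scott Pilgrim</dc:title>
    <meta refines="#collection" property="title-type">collection</meta>
    <meta refines="#collection" property="group-position">1</meta>
  </metadata>
</package>"""


def titles_by_type(opf_xml):
    """Map each title-type to the list of titles carrying it."""
    root = ET.fromstring(opf_xml)
    # "#id" -> title-type, taken from the refining meta elements.
    types = {
        meta.get("refines"): (meta.text or "").strip()
        for meta in root.iterfind(".//opf:meta[@property='title-type']", NS)
    }
    result = {}
    for title in root.iterfind(".//dc:title", NS):
        kind = types.get("#" + title.get("id", ""), "untyped")
        result.setdefault(kind, []).append((title.text or "").strip())
    return result


print(titles_by_type(SAMPLE_OPF))
```

A title without a title-type refinement ends up under "untyped", which is how a reader could fall back to its current behaviour.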
Original comment by chocolat...@gmail.com
on 12 Jul 2013 at 7:41
One thing to take into account is that that is a rather vague way of
identifying a series, and different series with the same title may end up being
grouped together.
Again, the specs are so broad one could argue it does provide a way to add a
unique identifier to that series by means of refining a dc:identifier with a
code from ONIX for Books, List 13, "Series identifier type code" (see
<http://www.editeur.org/files/ONIX%20for%20books%20-%20code%20lists/ONIX_BookProduct_CodeLists_Issue_21.html#codelist13>),
but of course, again, I doubt any reading system will ever care for such a thing:
<dc:identifier id="collectionId">value</dc:identifier>
<meta refines="#collectionId" property="identifier-type" scheme="onix:codelist13">value</meta>
Original comment by chocolat...@gmail.com
on 12 Jul 2013 at 7:54
We finally reached consensus on how we can express this metadata.
A few notes first:
- the goal here is to express that the current publication belongs to a
collection and express information about this collection
- this is different from what we have right now in the spec (which is only
about title, and enables content creators to divide a title into multiple
elements and provide information about each sub-element)
- our scope is a bit larger than initially planned, instead of focusing on just
series, we now support any kind of collection
Here's our list of MAY/SHOULD/MUST:
- a publication MAY belong to one or more collections
- a collection MUST have a title
- a collection SHOULD provide an identifier
- a collection MAY have a collection type (series, set, volume)
- a publication MAY provide its position within that collection
Here's an example of how this works:
<meta property="belongs-to-collection" id="pub-collection">The Lord of the Rings</meta>
<meta refines="#pub-collection" property="collection-type">set</meta>
<meta refines="#pub-collection" property="group-position">2</meta>
<meta refines="#pub-collection" property="dc:identifier">Unique identifier for the set</meta>
We introduce a new primary expression named "belongs-to-collection" which
indicates that the current publication belongs to a collection and also provides
the title for that collection.
Using "refines", we can then provide additional information such as the type of
the collection, the position within that collection and the identifier for the
collection.
"group-position" and "dc:identifier" are already part of the current spec,
while "collection-type" has a controlled list of values. The proposed list for
now is: series, set and volume.
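A reading system could gather a collection and its refinements roughly as follows. This is a minimal Python sketch that assumes the property names proposed above; the function name collections and the sample package wrapper are illustrative, and only top-level belongs-to-collection entries are handled.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Hypothetical package document using the proposed properties.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata>
    <meta property="belongs-to-collection" id="pub-collection">The Lord of the Rings</meta>
    <meta refines="#pub-collection" property="collection-type">set</meta>
    <meta refines="#pub-collection" property="group-position">2</meta>
  </metadata>
</package>"""


def collections(opf_xml):
    """Return one dict per collection, merging in the meta elements that refine it."""
    root = ET.fromstring(opf_xml)
    metas = list(root.iter(f"{{{OPF_NS}}}meta"))
    found = []
    for meta in metas:
        # Only entries that do not refine something else start a record.
        if meta.get("property") != "belongs-to-collection" or meta.get("refines"):
            continue
        record = {"title": (meta.text or "").strip()}
        if meta.get("id"):
            target = "#" + meta.get("id")
            for other in metas:
                if other.get("refines") == target:
                    record[other.get("property")] = (other.text or "").strip()
        found.append(record)
    return found


print(collections(SAMPLE_OPF))
```

Because every refinement points at one specific id, a publication with two collections yields two separate records, each with its own position.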
Original comment by hadrien....@feedbooks.com
on 19 Sep 2013 at 6:54
Spec additions to implement this proposal are available in the following document:
https://docs.google.com/document/d/1pISSPSdHaUjdUZ3yL9LPipkHjbswZBWoTE3yjLQL7JI
The final proposal removes the value "volume" from collection-type.
Original comment by mgarrish
on 21 Sep 2013 at 8:22
Specification has been updated per the proposal:
https://code.google.com/p/epub-revision/source/detail?r=4757
Original comment by mgarrish
on 27 Sep 2013 at 3:10
group-position removed from title examples:
https://code.google.com/p/epub-revision/source/detail?r=4759
Original comment by mgarrish
on 27 Sep 2013 at 3:18
Original comment by bkasd...@apexcovantage.com
on 17 Oct 2013 at 8:42
Where is this reflected in the draft of EPUB 3.1 of Jan 30th 2016?
Edit: Rereading the metadata chapter of the draft, I guess series metadata now moves out of the OPF, and one is supposed to detail it in a separate ONIX XML file or some other metadata standard bundled within the EPUB archive.
The Package Document is not designed to provide a comprehensive bibliographic record, and is not the correct location for such discovery information about the EPUB Publication. Metadata records, both that conform to international standards or that are designed for custom use, can instead be associated using the link element.
Why was this removed in EPUB 3.1? ONIX XML seems like overkill for something as common as series information. (Series are after all one of the most purchased book types, no?)
Also, similar to NCX, ONIX is an entirely different beast of a format which isn't explained in EPUB itself, making it a lot more complicated to include what should be pretty straightforward information... (and making it less likely that reader systems actually implement checking for this info)
I don't see how optional additional meta tags that add simple fields make things complicated. And how can there ever be large developer adoption if it's just in one minor version of the standard and then taken out again immediately with the next iteration?
I'm aware ONIX does a lot more, but that's also why I think it's not a good match. (and if you push people to ONIX, I don't see how metadata gets "simpler" at all.)
Did you ever do a survey of the users if they'd be interested in series information being shown? I think that would be a much more helpful idea than to survey the current state of developer adoption.
The idea was originally to have linked schema.org records which could be indexed in a web-friendly version of EPUB. (EPUB 3.1 had a number of goals that weren't fully achieved, or won't be until 4.0.)
But we made a mistake with 3.0 of getting into metadata vocabularies, and were warned not to at the time. EPUB should have only provided the framework for expressing metadata, but we caved in and added some "starter" properties to address a few needs. Dropping the unused properties was an attempt to move away from that approach while retaining what was actually used.
If there are holes in the metadata, the W3C publishing group should work with schema.org, for example, to ensure that the CreativeWork classes have necessary metadata instead of always trying to build things in isolation, so then you could use a meta tag to express the series title.
Series almost never used by reading systems for the user library (as shown per survey to developers)
Then I can only assume Calibre users were not included in this survey.
Or was this a survey of developers, to see who had implemented the feature? A point of interest might be how many reading systems have adopted Calibre’s non-standard series metadata, and how much of a draw that feature is.
The idea was originally to have linked schema.org records which could be indexed in a web-friendly version of EPUB. (EPUB 3.1 had a number of goals that weren't fully achieved, or won't be until 4.0.)
Following https://idpf.github.io/epub-guides/schema-org-integration/, what would this look like? As best as I can read this, these lines would be somewhere in content.opf:
<meta property="rdf:type">http://schema.org/Book</meta>
…
<meta property="schema:isPartOf">Discworld</meta>
<meta property="schema:position">37</meta>
(which maps nicely to Calibre’s series and series_index, at least if decimal numbers are used in the position field).
ONIX is a B2B (publisher to bookseller) metadata vocab. What is missing in EPUB is IMHO a complete B2C metadata vocabulary (user facing, ready for client-side filtering, search etc.).
Yea the problem with using external standards is that you really need examples or it'll be quite hard to guess how to integrate it. If this is actually still possible but with an external addition, it would be really helpful if this could be shown somewhere in more detail (without just going "just look at ONIX").
As best as I can read this, these lines would be somewhere in content.opf:
That looks like the appropriate tagging for a series, but I work more on the accessibility side so they're not properties I've applied.
We anticipated questions of common practice by starting this guide: https://idpf.github.io/epub-guides/package-metadata/
I think it would be helpful to document this case. @laudrain ?
@HadrienGardeur any thoughts you have here on series tagging would also be helpful?
@mattgarrish sure, this is how we handle them in the Readium Web Publication Manifest: https://github.com/readium/webpub-manifest/tree/master/contexts/default#collections--series
The serialization would be different but the infoset remains the same.
In Readium-2 some implementations are already capable of extracting Calibre metadata for series by the way, that's the case in Go at least.
Today for EPUB, we should not reinvent any metadata language. I agree with @mattgarrish to use schema.org and contribute if there are holes.
Schema.org has properties for BookSeries http://schema.org/BookSeries in CreativeWork > CreativeWorkSeries > BookSeries: "A series of books. Included books can be indicated with the hasPart property."
There is provision in http://schema.org/Book for isPartOf CreativeWork. Not an expert, but I hope it can link to a CreativeWork > CreativeWorkSeries > BookSeries. And http://schema.org/Book also has a position: "The position of an item in a series".
We also use schema.org for the Readium Web Publication Manifest, but that's through a JSON-LD context.
As far as I'm aware, a number of reading systems definitely support series. For instance that's the case in Aldiko and in quite a few comics apps.
The Readium example which @HadrienGardeur provides allows more than the Calibre syntax (or my understanding above of how to use schema.org): Calibre allows only one series and one series_index. But of course that version requires bibliographic metadata in yet another file, and in yet another format. Lots of reader programs support Calibre’s series metadata within content.opf (Mantano’s Bookari, e.g.), but does anything at all on the market support metadata in other files and in other formats?
@jcsalomon I'm not suggesting another file/format.
What I've linked is what we use internally in Readium-2 (an SDK for reading apps), and this is also a proposal for EPUB4/WP.
For an EPUB 3.x revision, it's probably easier to express the same info using schema.org (which we're using behind the scenes anyway in Readium).
Calibre allows only one series and one series_index.
For an EPUB 3.x revision, it's probably easier to express the same info using schema.org
And it will be true for the EPUB 3 package document that you can only express one series, too, at least if you want to indicate the position. Without the ability to group, the positions would become ambiguous to a machine ("refining" being a terrible option here, as usual).
While this is unfortunate, I would assume a book being part of one series covers most use cases, right? Or how common is it for a book/written work in one specific release to be part of multiple series?
I can only imagine that e.g. for a book part of a story series released as part of some "best of collection" - but in that case it should work as a standalone story anyway if it's singled out like that, and just annotating it as part of the collection should be an acceptable approximation I think.
Another alternative would be to turn positions into a first-class attribute in the next EPUB revision.
We'd get something almost on par with Readium:
<meta property="schema:Series" opf:position="2">Discworld</meta>
IMO, there is a good case in favour of this:
The opf:position attribute could also work on other elements, such as dc:title, to replace the sequence order that we had through refine in 3.0.x.
This would also be a clear path forward towards EPUB 4 if the WG ends up adopting the RWPM for the WP manifest.
@HadrienGardeur but this custom namespaced element's data would fall completely out of the reach of any RDFa parser--whereas the variation proposed by @jcsalomon in https://github.com/w3c/publ-epub-revision/issues/326#issuecomment-361309750 stays within that parsing/data-model space.
If you put it in a separate XML namespace, you'll have to come up with separate methods to extract/parse/manage/understand it.
@BigBlueHat I think that's mostly irrelevant in the case of EPUB 3.x:
IMO something straightforward and powerful enough is better than trying to achieve RDF purity.
Ideally we'd just have:
<series position="2">Discworld</series>
Hm. Maybe it should still have some sort of obvious reference to schema.org/Book (which a <series> tag might not obviously have) even just to make it clear where it comes from?
Otherwise, this is circling back to putting all metatags directly into the EPUB definition instead of using reasonable existing things.
I think something like <meta property="schema:Series" opf:schema:position="2">Discworld</meta> which obviously refers to the book schema might be a better idea. While that would still require demonstration in the meta guide here https://idpf.github.io/epub-guides/package-metadata/ it wouldn't be such a huge departure from the book schema.org thing that every property of that schema would require a demonstration like that.
I just can't get into the idea of minting more metadata that's unique to EPUB, especially not new elements. Let's work with what schema.org provides, and live with the limitations of implementing in the package as it's defined for EPUB 3.
I wrote the integration guide while trying to implement the accessibility and educational metadata, which is why it defaults to CreativeWork and recommends rdf:type for any other instances. Realistically, it's not the optimal choice and we should treat the package as an instance of Book by default. That's where the most useful additional properties are.
I think this is something we may want to note in the specification and not farm out to an informative guide. To an extent Hadrien is right that it doesn't matter what you do unless something is parsing out a graph or translating to compliant schema.org metadata, but leading people blindly to bad practices isn't a great idea, either. Assuming an EPUB 4 that has a real metadata framework and uses schema.org metadata, it's going to come back to bite people upgrading their content.
@mattgarrish frankly, it barely matters in the context of EPUB 3.x. If it's useful for a significant portion of the community, we might as well have it in our own namespace.
As for EPUB4, I'm advocating for a solution based on JSON-LD and schema.org myself (RWPM) but the other option out there (WAM) is not tied to any existing vocabulary.
I don't remember what the spec has to say about XML attributes, but can't we at least use schema:position as an attribute and avoid repeating meta like in the dark ages of EPUB 3.0.1 (with its infinite refine nonsense)?
<meta property="schema:Series" schema:position="2">Discworld</meta>
I don't remember what the spec has to say about XML attributes
They aren't allowed unless explicitly defined. There's no real extensibility of the package metadata beyond being able to use the package/@epub:vocab and meta/@property to reference vocabularies/properties.
Additions along the attribute axis are less invasive than new elements, but how would such a thing work? Do we allow any schema:* attribute and leave it to implementers to figure out how to make it work, or do we only add schema:position, in which case wouldn't it be a little odd that the attribute is available no matter what property you're expressing?
My worry with the first option is that it could make it impossible to translate the metadata. Refines is messy, but there is some (untested) logic to how it can be translated.
Looking at schema.org, doesn't it also enforce the one-series flaw of the package document? Since isPartOf and position are both child properties of CreativeWork/Book and not a set, a publication can't have an unambiguous position in more than one series.
The BookSeries class doesn't solve this problem, either, although it lets you say a lot more about the series. (Using the title of the series as the value of isPartOf is a technical violation, but schema.org isn't strict in enforcing the expected type of any property.)
Unless I'm missing something, my inclination would still be to live within what is possible to do now.
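To make the ambiguity concrete, here is a small Python sketch of what a consumer sees once flat properties are read into a property/value model. The series names and positions are invented, and the helper group_by_property is only an illustration of the data model, not any real parser.

```python
# Flat statements about one publication, as a graph-style consumer would read them.
flat = [
    ("schema:isPartOf", "Series A"),
    ("schema:position", "8"),
    ("schema:isPartOf", "Series B"),
    ("schema:position", "1"),
]


def group_by_property(pairs):
    """Collect the values of each property into an unordered set."""
    grouped = {}
    for prop, value in pairs:
        grouped.setdefault(prop, set()).add(value)
    return grouped


grouped = group_by_property(flat)
# Two series and two positions, but nothing links "8" to "Series A" rather than "Series B".
print(grouped)
```

With refines, each position is attached to one collection by id, which is exactly the pairing that is missing here.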
@mattgarrish schema:Series is a CreativeWork as well, which means that it can have a position.
Looking back at our EPUB 3.1 WG, we did end up introducing a number of new XML attributes, in order to express a number of things that required refine before:
opf:alt-rep
opf:alt-rep-lang
opf:role
opf:scheme
opf:authority
opf:term
What we never ported to attributes is the ability to express a position in a list or sequence. EPUB 3.0.1 had this for titles for instance, here's an example from the spec:
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title id="t1" xml:lang="fr">Mon premier guide de cuisson, un Mémoire</dc:title>
<meta refines="#t1" property="title-type">main</meta>
<meta refines="#t1" property="display-seq">2</meta>
<dc:title id="t2">The Great Cookbooks of the World</dc:title>
<meta refines="#t2" property="title-type">collection</meta>
<meta refines="#t2" property="display-seq">1</meta>
<dc:title id="t3">The New French Cuisine Masters</dc:title>
<meta refines="#t3" property="title-type">collection</meta>
<meta refines="#t3" property="display-seq">3</meta>
<dc:title id="t4">Special Anniversary Edition</dc:title>
<meta refines="#t4" property="title-type">edition</meta>
<meta refines="#t4" property="display-seq">4</meta>
<dc:title id="t5">The Great Cookbooks of the World:
Mon premier guide de cuisson, un Mémoire.
The New French Cuisine Masters, Volume Two.
Special Anniversary Edition</dc:title>
<meta refines="#t5" property="title-type">expanded</meta>
…
</metadata>
I believe that we could create a new attribute that would work for title as well as http://schema.org/Series and http://bib.schema.org/Collection.
This new opf:position would express the position of an element in a sequence/list.
Here are two examples:
<title opf:position="1">Flatland</title>
<title opf:position="2">A Romance of Many Dimensions</title>
<meta property="bib:Collection" opf:position="26">SF Classics</meta>
<title>Guards! Guards!</title>
<meta property="schema:Series" opf:position="8">Discworld</meta>
<meta property="schema:Series" opf:position="1">City Watch</meta>
(BTW, this is a valid example of a book where two series are useful, since you could either read all of the Discworld series or just the one focusing on the City Watch. There are quite a few similar examples in fantasy or SF series.)
@HadrienGardeur's proposal is appealing, much easier to understand than the use of the refine attribute. But if we don't deprecate refine in 3.2 because of its use by the Japanese publishing industry, can we still make sure that we don't end up with two alternative ways to do the same thing? In other words, does the Japanese industry use refine with property = display-seq? If not, at least this value could be deprecated...
schema:Series is a CreativeWork as well, which means that it can have a position
That position would be the position of the series in something else to which it belongs, not the position of something which belongs to the series. Series is its own class that just describes the series. The publication is part of a series, but its position is unique to itself and expressed within its own class, which can only be done once.
I'm also leery of mixing ordering of elements together with position within a set. They're incongruous concepts. How does a reading system know when you're referring to display sequence or when the number is just a bit of information that is supposed to be displayed? That's why we separated display-seq from group-position.
But this may all be moot, since 3.2 is returning refines and with it has to come belongs-to-collection, collection-type and group-position. We may as well keep using those.
Frankly, the semantics for schema.org can be a little fuzzy as well. A position on a CreativeWork itself means nothing, it has to be a position in the context of an ordered list.
For display-seq vs group-position, I think that we're overthinking. Once again, we're barely RDF-ish, I'd much rather have fewer attributes and straightforward metadata expression than semantic purity (which we won't achieve anyway).
I would also get rid of either opf:scheme or opf:authority, a single attribute can IMO work for both use cases.
The return of refines and all associated properties is just sad, they're terrible to work with and it feels like going backward.
A position on a CreativeWork itself means nothing, it has to be a position in the context of an ordered list.
That's the flaw I mentioned above by having them both as direct properties of each class in schema.org. isPartOf says which series it belongs to and the completely detached position gives its position in the series. They should be grouped to avoid the ambiguity, but in such a way that they are their own unique datatype for some other property like belongs-to-collection. That's the problem of a natural growth vocabulary. Too much is sometimes thrown at the wall too quickly.
I hate properties that change meaning depending on context, though, which is why I don't agree it's overthinking to have two. With display-seq, the set is fully defined in the metadata (it's an ordering property for like things); group-position indicates belonging to a set that is defined somewhere else. If you squash them together, I can't see how you can be sure of anything unless you code the logic for every instance.
I do agree that the return of refines is sad; I was glad to ring its death knell in 3.1. But it does allow multiple unambiguous series titles and positions, again.
No opinion here on opf:scheme and opf:authority. Didn't we introduce opf:authority because we were trying to keep opf:scheme compatible with its use in EPUB 2? At any rate, I believe these are all toast in 3.2 since we're reverting to refines.
But fwiw, display-seq should be deprecated in 3.2. As far as I remember, we defined document order as the indicator of display for various elements since reading systems use that anyway.
@llemeurfr wrote
But if we don't deprecate refines in 3.2 because of its use by the Japanese publishing industry,
How are they using it (besides for series metadata in e-comics)? Whatever alternative is proposed must accommodate their use-case.
And while refines is un-XML-ish and complex usages can get unwieldy, I’m a small-time book producer (made four books for an author friend) and I’m reasonably tech-savvy: I figured out the refines format for series metadata in a few minutes of reading the documentation.
I also spent a few days reading the 3.1 standard and chasing down one outside reference after another and could not figure out what the 3.1 Way was. Well, I partly could: include in content.opf a link to a metadata file to be stored elsewhere in the directory tree, this metadata being in any of a dozen formats used by various cataloging schemes but which are neither easily readable by mere humans nor writeable without flipping back-and-forth between multiple cross-linked standards documents, and knowing that even if any e-reader implemented the 3.1 standard chances were against it also understanding whichever additional metadata format I might choose.
(Feature request for an EPUB 2 e-reader: “Here in two lines is Calibre’s extension; can you please interpret these and let the user categorize books by series?”
(Ditto for an EPUB 3 e-reader [adapted from an actual feature request I submitted]: “Here in three lines is EPUB 3’s series standard; can you please interpret these and let the user categorize books by series?”
(Ditto for an EPUB 3.1 e-reader: Don’t make me laugh.)
You want the cataloging metadata out of content.opf? fine, give me another place to put it. You don’t like refines? fine, give me another format to work with. But—
After giving this some more thought, I think a big issue with EPUB 3+ is that <meta property="schema:numberOfPages">227</meta> is completely unlike the examples schema.org gives, which are <span property="numberOfPages">224</span> (RDFa) and <span itemprop="numberOfPages">224</span> (microdata). That EPUB 3 appears to use schema.org, and mostly says "just check schema.org for what you can use", but then didn't reuse one of the formats that schema.org actually explains is a small disaster, in my humble opinion.
By the way, the schema.org JSON-LD variant looks infinitely more readable than all the weird EPUB-meta-tags-and-refines-mixed-with-schema.org-but-different:
{
  "@context": "http://schema.org/",
  "@id": "#record",
  "@type": "Book",
  "additionalType": "Product",
  "name": "Le concerto",
  "author": "Ferchault, Guy",
  "offers": {
    "@type": "Offer",
    "availability": "http://schema.org/InStock",
    "serialNumber": "CONC91000937",
    "sku": "780 R2",
    "numberOfPages": 134,
    "offeredBy": {
      "@type": "Library",
      "@id": "http://library.anytown.gov.uk",
      "name": "Anytown City Library"
    },
    "businessFunction": "http://purl.org/goodrelations/v1#LeaseOut",
    "itemOffered": "#record"
  }
}
Part of the problem with the meta/refines/nonsense is that you can't nest properly, and that due to always writing out "meta" and "refines" as full words, it becomes super lengthy and complicated really quick.
Maybe it would be best to consider adopting the exact schema.org JSON-LD standard as an additional file or inline JSON inside a tag? Or at least pick one of the exact other formats as specified on schema.org. I don't think referring to schema.org for the contents but completely redefining the syntax inside EPUB with complicated meta/refines nesting stuff does anyone any good.
I mean, look how lengthy this is just to explain how to nest things: https://idpf.github.io/epub-guides/schema-org-integration/#h.8w8btnbwlf6r Wouldn't it be a lot easier for everyone involved to just write "use <schema-org-meta> .... JSON-LD contents here ... </schema-org-meta>, everything about the syntax is on schema.org, and here is a single non-trivial example in a complete OPF: ..."?
Any thoughts on this? I really think if you want to use the schema.org standard, you should also use e.g. the JSON-LD syntax variant exactly as specified there (or one of the others) - or alternatively, specify metadata directly in EPUB without referring to schema.org. The EPUB format already has quite the complexity, you're not really doing it any favors by adding another complicated standard, and then completely deviating from its own example pages of how to use it...
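For comparison, this is roughly what series information could look like in the schema.org JSON-LD form. It is only a sketch built with Python's json module: no EPUB version defines inline JSON-LD in the package document, and the choice of isPartOf with a nested BookSeries plus a top-level position is my assumption of how the existing schema.org properties would be combined.

```python
import json

# Hypothetical record; the title, series name and position come from examples earlier in this thread.
record = {
    "@context": "http://schema.org/",
    "@type": "Book",
    "name": "Guards! Guards!",
    "isPartOf": {"@type": "BookSeries", "name": "Discworld"},
    "position": 8,
}

print(json.dumps(record, indent=2))
```

Note that position still sits on the book itself, so this form has the same one-series limitation that was raised earlier in the thread.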
As a publisher, it is unclear to me what the consensus is on how to implement a series and number, at this point in the discussion. I understand the calibre metadata tags and will use those, but could someone summarize/point out what I should do to be standards-compliant as of Nov 2018? That would likely be helpful to others. If there is zero consensus, then the most popular way to indicate series and number would be helpful.
@jeffmcneill, since the “new” EPUB3 is mostly a reversion to EPUB 3.0.1, that’s the model to follow. See https://w3c.github.io/publ-epub-revision/epub32/spec/epub-packages.html#sec-belongs-to-collection for the spec, though the examples at https://w3c.github.io/publ-epub-revision/epub32/spec/epub-packages.html#group-position are perhaps a bit more complete. Basically:
<meta id="num" property="belongs-to-collection">Series Name Goes Here</meta>
<meta property="collection-type" refines="#num">series</meta>
<meta property="group-position" refines="#num">1</meta>
Caveats:
- set instead of series. The example shown puts a Harry Potter book in a set, but would that have been the right decision when the books were still being released? How about a book series whose final length is not known at the moment?
- (group-position is “A single xsd:unsignedInt or series of decimal-separated numbers (e.g., 1 or 2.2.1).”)
<meta property="belongs-to-collection" id="c02">Harry Potter</meta>
<meta refines="#c02" property="collection-type">set</meta>
<meta refines="#c02" property="group-position">2</meta>
<meta refines="#c02" property="dcterms:identifier">urn:uuid:99999999-8888-7777-6666-555555555555</meta>
but I have no idea whether any implemented reading system actually cares about that. Probably can’t hurt, though.
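If you generate your OPF from a script, the basic three-line pattern is easy to emit. Below is a minimal Python sketch; the function name series_meta and its default id are my own invention, and it covers only the series case shown under "Basically:" above.

```python
from xml.sax.saxutils import escape, quoteattr


def series_meta(name, position, collection_id="series"):
    """Return the three <meta> lines for a series, with text and attribute values escaped."""
    ref = quoteattr("#" + collection_id)
    return "\n".join([
        f'<meta id={quoteattr(collection_id)} property="belongs-to-collection">{escape(name)}</meta>',
        f'<meta property="collection-type" refines={ref}>series</meta>',
        f'<meta property="group-position" refines={ref}>{escape(str(position))}</meta>',
    ])


print(series_meta("Series Name Goes Here", 1, "num"))
```

Called with those arguments, it prints the same three lines as the example above; escape takes care of characters such as & in a series name.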
Original issue reported on code.google.com by hadrien....@feedbooks.com
on 27 Mar 2013 at 3:03