thoth-pub / thoth

Metadata management and dissemination system for Open Access books
https://thoth.pub
Apache License 2.0

Support storing "additional resources" in Thoth #302

Open rhigman opened 3 years ago

rhigman commented 3 years ago

Books may come with "additional resources" separate from the main PDF, EPUB, etc. editions (e.g. https://www.openbookpublishers.com/product/498, which comes with several audio and text files, as listed on the website's Additional Resources tab; the audio files are integral to this particular book, while the text files are supplementary). No information about these resources is currently stored in Thoth.

These could potentially be represented as "book chapters", i.e. child elements linked to a parent work (support for this to be added under #28).

Storing them in Thoth would mean a) easier automation of archiving (e.g. #289) and b) less need to manually edit book pages in the new OBP website (which may also eventually be used as a template by other Thoth-based publishers).

ja573 commented 3 years ago

Since they are normally excluded from metadata distribution we should model them separately.

Is the idea to include any additional resource (e.g. blog posts or further reading by the author) or only the ones that are coupled with the book (embedded or heavily referenced)?

rupertgatti commented 3 years ago

I'm not sure I agree with the first sentence actually!

But - there are two levels of additional resources to consider.

a. Material that ideally is embedded directly in the work (such as the music files in Diderot) - this material forms an integral part of the work, so it is genuinely a 'child' of the work, and the metadata associated with it 'should' be available. That is, the book is not 'complete' without access to this material. This is also really important for archiving and preservation reasons. (On a side note - at some level one could argue that images (and even specific paragraphs) are also integral parts of the work, and indeed we could create metadata for images as well if we wanted (taken, say, from the list of illustrations), but I don't see that as particularly important at present, because once embedded in any standard format we can be confident that they will remain connected. That is not the case with audio/visual content, which is why we may need to consider it separately at this point.)

b. A second level of resources is the genuinely 'additional' resources - those that are nice to have, are potentially referenced in the work, etc. Do we allow those to be included within Thoth? Maybe we can create a way, but I do see this as a different category of content. Clearly the use case here is as Ross said: how do we include them on the website if not via Thoth? But this type of content is (for me) more closely aligned with any other material referenced in the work, be it an image in the British Museum or a text in a library. The reason they are on the OBP website is that they are not easily referenced/accessed anywhere else. So a route for a publisher to host them elsewhere would be useful, but they are not considered integral to the work itself.

ja573 commented 3 years ago

We'll most likely want to have a separate resource table with a type (e.g. image, blog, audio, etc.) and some sort of flag that indicates if the resource is part of the book or not. The question will be to determine whether these are linked to the individual chapters or the book as a whole, depending on how platforms will consume this data.
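To make the proposal above concrete, here is a minimal sketch of what such a resource record might look like. It is only an illustration, not Thoth's actual schema: the names `Resource`, `ResourceType`, `is_integral`, `chapter_id` and the example URLs are all hypothetical. An optional chapter link is used here so that a resource can attach either to a chapter or to the book as a whole, which leaves that open question undecided.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ResourceType(Enum):
    # Hypothetical set of types, taken from the examples in this thread.
    IMAGE = "image"
    BLOG = "blog"
    AUDIO = "audio"
    TEXT = "text"


@dataclass(frozen=True)
class Resource:
    resource_id: str
    work_id: str                      # the parent book
    resource_type: ResourceType
    url: str
    is_integral: bool                 # True if the book is incomplete without it
    chapter_id: Optional[str] = None  # None means "linked to the book as a whole"


def integral_resources(resources: List[Resource], work_id: str) -> List[Resource]:
    """Return the resources of one work that are flagged as part of the book."""
    return [r for r in resources if r.work_id == work_id and r.is_integral]


# Hypothetical sample data: one embedded audio track, two supplementary items.
resources = [
    Resource("r1", "book-1", ResourceType.AUDIO,
             "https://example.org/track-1.mp3", True, chapter_id="ch-3"),
    Resource("r2", "book-1", ResourceType.TEXT,
             "https://example.org/notes.txt", False),
    Resource("r3", "book-2", ResourceType.BLOG,
             "https://example.org/post", False),
]

for r in integral_resources(resources, "book-1"):
    print(r.resource_id, r.resource_type.value, r.chapter_id)
```

With a flag like this, an archiving or metadata output could select only the integral resources, while a website could list everything attached to the work.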

rupertgatti commented 2 years ago

George Corbett's book Annunciations is an interesting case: https://www.openbookpublishers.com/books/10.11647/obp.0172 The recordings of the musical passages follow at the end of the composers' reflections and precede the score. The recordings are provided with their own DOIs, as if they were chapters (and we needed to register the metadata as part of the CrossRef submission - so this should be part of the CrossRef output). Presently the individual recordings are hosted by OBP, but are not available to download from the OBP website (presumably because they don't have a Thoth entry) - but they should be!