DIYBookScanner / spreads

Modular workflow assistant for book digitization
GNU Affero General Public License v3.0

Metadata support #9

Status: Closed (jbaiter closed 10 years ago)

jbaiter commented 11 years ago

Add support for metadata to the spreads workflow. Users should be able to specify the usual information about digitized content, such as author, title, publisher, year, and edition. One approach would be to add a get_metadata function to the workflow that takes keyword arguments as input and forwards them to plugins. The plugins would return a SpreadsMeta object that can then be processed further. The metadata could then be used to enrich the various output formats and to create more meaningful filenames.
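To make the proposal concrete, here is a minimal sketch of how such a get_metadata call might forward keyword arguments to plugins and merge their results. Only the names get_metadata and SpreadsMeta come from the proposal; the plugin classes, the merge behaviour, and the make_filename helper are assumptions for illustration, not the spreads API.

```python
from dataclasses import dataclass, field


@dataclass
class SpreadsMeta:
    """Hypothetical container for bibliographic metadata."""
    data: dict = field(default_factory=dict)

    def merge(self, other):
        # Later plugins only fill in fields that are still missing.
        for key, value in other.data.items():
            self.data.setdefault(key, value)


class UserInputPlugin:
    """Hypothetical plugin that passes the user's keyword arguments through."""
    def get_metadata(self, **kwargs):
        return SpreadsMeta({k: v for k, v in kwargs.items() if v is not None})


class DefaultsPlugin:
    """Hypothetical plugin that supplies fallback values."""
    def get_metadata(self, **kwargs):
        return SpreadsMeta({"publisher": "unknown"})


class Workflow:
    def __init__(self, plugins):
        self.plugins = plugins

    def get_metadata(self, **kwargs):
        # Forward the keyword arguments to every plugin and merge the results.
        meta = SpreadsMeta()
        for plugin in self.plugins:
            meta.merge(plugin.get_metadata(**kwargs))
        return meta


def make_filename(meta, extension):
    # Build a more meaningful filename from whatever metadata is available.
    keys = ("author", "title", "year")
    parts = [str(meta.data[k]) for k in keys if k in meta.data]
    stem = "_".join(parts) if parts else "untitled"
    return stem.replace(" ", "-") + "." + extension
```

In this sketch, the first plugin to provide a field wins, so the order of the plugin list decides precedence. A real implementation might prefer explicit priorities instead.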

jhub95 commented 10 years ago

How are you thinking of storing the metadata? In the database, or maybe in an XML flat file in the workflow's directory? How will users copy the metadata into their book reader applications?

jbaiter commented 10 years ago

I honestly don't think that spreads should deal directly with book reader applications. I think a good strategy would be to have a meta directory where metadata plugins can store their relevant files, which will most probably be XML. Output plugins could then look into that directory and use the metadata to enrich the generated output files (PDF, ePub, etc.). This is on my to-do list for after the postprocessing server has been completed. I'll probably write a METS plugin as a reference implementation once I've got the API figured out.
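As a rough sketch of the meta directory idea, the following shows one way a metadata plugin might write an XML file into the workflow directory and an output plugin might read it back. The function names, the basic.xml filename, and the flat XML layout are assumptions for illustration; this is neither METS nor the format spreads ended up using.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def write_metadata(workflow_dir, metadata):
    """Hypothetical metadata plugin: store fields as XML under <workflow>/meta."""
    meta_dir = Path(workflow_dir) / "meta"
    meta_dir.mkdir(parents=True, exist_ok=True)
    root = ET.Element("metadata")
    for key, value in metadata.items():
        # Keys are assumed to be valid XML element names.
        ET.SubElement(root, key).text = str(value)
    target = meta_dir / "basic.xml"
    ET.ElementTree(root).write(str(target), encoding="utf-8", xml_declaration=True)
    return target


def read_metadata(workflow_dir):
    """Hypothetical output plugin helper: collect fields from all XML files in meta/."""
    meta_dir = Path(workflow_dir) / "meta"
    collected = {}
    if not meta_dir.is_dir():
        return collected
    for xml_file in sorted(meta_dir.glob("*.xml")):
        for child in ET.parse(str(xml_file)).getroot():
            collected[child.tag] = child.text
    return collected


def demo():
    # Round-trip a few fields through a temporary workflow directory.
    with tempfile.TemporaryDirectory() as workflow_dir:
        write_metadata(workflow_dir, {"author": "Jane Doe", "title": "A Book", "year": 1901})
        return read_metadata(workflow_dir)
```

Because values are serialized as text, anything read back is a string, so an output plugin would need to convert fields such as the year itself.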

atomotic commented 10 years ago

I agree as well that reader apps are out of scope for spreads.
Regarding metadata and packaging, I suggest taking a look at the curation microservices, specifically BagIt [1] and Namaste [2], which are simple conventions for packaging things inside a directory. The Python implementations are rather mature [3][4].

[1] https://wiki.ucop.edu/display/Curation/BagIt
[2] https://wiki.ucop.edu/display/Curation/Namaste
[3] https://github.com/libraryofcongress/bagit-python
[4] https://github.com/edsu/namaste
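To give an idea of what such a directory convention looks like, here is a simplified, standard-library-only sketch of a BagIt-style layout: payload files under data/, a checksum manifest, and metadata in bag-info.txt. It deliberately does not use bagit-python and skips most of the specification; the function names, the version string, and the "Title" label are assumptions for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def make_simple_bag(bag_dir, payload, info):
    """Write payload files and metadata in a simplified BagIt-style layout."""
    bag = Path(bag_dir)
    data_dir = bag / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    manifest_lines = []
    for name, content in sorted(payload.items()):
        (data_dir / name).write_bytes(content)
        checksum = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{checksum}  data/{name}")

    # Bag declaration; the version number here is an assumption.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n", encoding="utf-8")
    (bag / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8")
    # Free-form "Label: value" metadata lines.
    (bag / "bag-info.txt").write_text(
        "".join(f"{key}: {value}\n" for key, value in info.items()), encoding="utf-8")
    return bag


def verify_simple_bag(bag_dir):
    """Return True if every payload file matches its manifest checksum."""
    bag = Path(bag_dir)
    manifest = (bag / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        checksum, relative_path = line.split(None, 1)
        actual = hashlib.sha256((bag / relative_path).read_bytes()).hexdigest()
        if actual != checksum:
            return False
    return True
```

For anything beyond experimentation, one of the existing implementations linked above is probably the better choice, since they cover the parts of the convention that this sketch leaves out.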

jbaiter commented 10 years ago

This has been implemented in https://github.com/DIYBookScanner/spreads/commit/c258ab7ac7391eeecdae74929db303d037509cf5.

The current version includes the following functionality:

Currently the metadata system does not have any plugin hooks, but I intend to add them at some point in the future. Additionally, the only UI that supports it at the moment is the web interface; the others still work but don't have the option to enter any metadata.