daisy / pipeline-scripts

!! NOTE: This project is now part of the pipeline-modules project !! | Script modules for the default DAISY Pipeline 2 distribution.
GNU Lesser General Public License v3.0

epub3-to-daisy202: text-only daisy 2.02 should also have SMIL files #86

Closed josteinaj closed 5 years ago

josteinaj commented 9 years ago

Media Overlay should be generated for all content in the EPUB before converting to DAISY 2.02. That's a useful step for other scripts as well.

bertfrees commented 5 years ago

DAISY 2.02 has some special requirements for the SMILs so we may have to generate the SMILs after the HTML conversion. In either case the code should be written in such a way that it is easily reusable.

The requirements are:

Generating the SMILs from scratch is relatively straightforward. But "augmenting" existing SMILs is more challenging, because of possible granularity mismatches.

Headings need to have their own par. So if an existing SMIL references segments within a heading, i.e. when it is too fine-grained, a solution is to merge all the segments in the heading into a single par. If the segments do not add up to the complete heading, or if the audio elements cannot be combined because they reference different audio files or because the clips are not contiguous, we have to error out.
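As a rough illustration of the merge rule for headings, here is a minimal Python sketch. It is not the Pipeline implementation (which would live in XProc/XSLT); the names `Clip`, `GranularityError`, `merge_heading_clips` and the `covers_whole_heading` flag are hypothetical, and the check that the segments add up to the complete heading is assumed to be done by the caller.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Clip:
    """Hypothetical stand-in for a SMIL audio element (times in seconds)."""
    src: str
    begin: float
    end: float


class GranularityError(ValueError):
    """Raised when the existing SMIL cannot be adapted to one par per heading."""


def merge_heading_clips(clips: Sequence[Clip],
                        covers_whole_heading: bool,
                        tolerance: float = 0.001) -> Clip:
    """Merge the clips of all segments inside one heading into a single clip.

    `clips` must be in document order. `covers_whole_heading` is an assumed
    input telling us whether the segments add up to the complete heading.
    """
    if not clips:
        raise GranularityError("heading has no audio segments")
    if not covers_whole_heading:
        raise GranularityError("segments do not cover the complete heading")

    first = clips[0]
    previous = first
    for clip in clips[1:]:
        # All segments must reference the same audio file.
        if clip.src != first.src:
            raise GranularityError("segments reference different audio files")
        # Each clip must start where the previous one ended.
        if abs(clip.begin - previous.end) > tolerance:
            raise GranularityError("clips do not follow each other")
        previous = clip

    return Clip(first.src, first.begin, previous.end)
```

For example, two clips of `a.mp3` covering 0.0 to 1.5 and 1.5 to 3.0 would merge into one clip from 0.0 to 3.0, while a gap between the clips or a second audio file would raise `GranularityError`.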

If an existing SMIL is too coarse-grained for the headings, we can also error out, because that seems unlikely to happen.

For page numbers, however, it is not so unlikely that the SMIL is too coarse-grained, because page numbers may appear inside paragraphs or even inside sentences (or words). A possible solution is to omit these page numbers from the NCC in this case.
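The idea of skipping such page numbers could look roughly like the Python sketch below. The function name `page_numbers_for_ncc` and its inputs are hypothetical: it assumes we already have the IDs of the page number elements and the set of element IDs that the SMIL pars reference directly.

```python
from typing import Iterable, List


def page_numbers_for_ncc(page_number_ids: Iterable[str],
                         smil_text_targets: Iterable[str]) -> List[str]:
    """Keep only the page numbers that have their own par in the SMIL.

    A page number whose ID is not referenced directly sits inside a coarser
    segment (paragraph, sentence or word) and is left out of the NCC.
    """
    targets = set(smil_text_targets)
    return [page_id for page_id in page_number_ids if page_id in targets]


# "p2" is inside a sentence that the SMIL references as a whole, so it is skipped.
kept = page_numbers_for_ncc(["p1", "p2", "p3"], ["h1", "p1", "s4", "p3"])
```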

bertfrees commented 5 years ago

See PR https://github.com/daisy/pipeline-scripts/pull/153.