snaekobbi / issues

Common issue tracker for the Braille in DAISY Pipeline 2 project

Volume breaking requirements #7

Closed bertfrees closed 7 years ago

bertfrees commented 9 years ago

Volume breaking is a constrained optimization problem. We have to optimize an objective function with respect to some variables subject to (soft or hard) constraints. In this issue I want to find out what the variables, constraints and objectives are.

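Possible objectives/constraints are:

  • limit size
  • minimize count
  • similar in size
  • sensible breaks
  • forced breaks
  • disallowed breaks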

In the following table I list the possible motives for these objectives/constraints. At the same time I draw parallels to line breaking and page breaking, because these are similar problems and similar techniques can be used to solve them.

| | line breaking | page breaking | volume breaking |
|---|---|---|---|
| limit size | limited by page width (hard limit) | limited by page height (hard limit) | volumes not too heavy; must fit in mailbox; limit on ring/wire binders; ideal size can be lower than hard upper limit |
| minimize count | maximal use of available space | maximal use of available space | reduce overhead (title pages etc.); minimise the need for the reader to switch volumes |
| similar in size | straight right margins? what about last line? | equal bottom margins? what about last page? | for aesthetic reasons; in order to minimize size of largest volume for a given number of volumes; a significantly smaller (or bigger) last volume can also make sense when it contains exclusively an appendix |
| sensible breaks | prefer at white space, not too many hyphens | why? | prefer at the beginning of sections |
| forced breaks | why? | why? | why? aesthetic reasons? |
| disallowed breaks | | | avoid inside lists; preferably not in the middle of a sentence |

The next table shows possible ways of optimizing the solution and controlling the objective function:

| | line breaking (css spec) | page breaking (css spec) | volume breaking (css spec) |
|---|---|---|---|
| limit size | | | max-length |
| minimize count | hyphens: auto | | |
| similar in size | | | min-length, max-length |
| sensible breaks | lefthyphenmin; don't hyphenate second to last line if last line has enough space | prefer at class A | volume-break: prefer |
| forced breaks | preserved line breaks (white-space: pre) | page-break: always | volume-break: always |
| disallowed breaks | avoid if not at soft wrap opportunity | | volume-break: avoid |

bertfrees commented 9 years ago

@snaekobbi/experts The problem of volume breaking is a tricky one because there are a lot of things I potentially have to take into account, depending on your exact requirements. As explained above, in this thread I want to find out what the constraints and objectives of the volume-breaking optimization problem are.

Basically what I want to do, with your help, is fill in the top table, and in particular the third column. I've done some guesswork about what the possible constraints and objectives could be, but you have to confirm or refute them, explain the motives, prioritize them, etc.

It could be an interesting mental exercise to reason about line breaking and page breaking (first and second column) at the same time, because in a lot of ways these are very similar problems. But what I'm mainly interested in here is volume breaking.
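To make the optimization view concrete, here is a minimal sketch of volume breaking as a dynamic program, in the spirit of optimal line-breaking algorithms. It is not the Pipeline's implementation. The function name, the block model (indivisible blocks with a page count), the per-break penalties and the squared-deviation cost are all assumptions made for illustration.

```python
import math

def break_volumes(sizes, penalties, max_pages, ideal_pages=None,
                  forced=frozenset(), volume_cost=0.0):
    """Return the page count of each volume in a minimum-cost split.

    sizes[i]      pages in block i (blocks are kept whole)
    penalties[i]  cost of starting a volume at block i; math.inf disallows it
    forced        block indices where a new volume must start
    volume_cost   fixed cost per volume (rewards a low volume count)
    """
    if ideal_pages is None:
        ideal_pages = max_pages
    n = len(sizes)
    best = [math.inf] * (n + 1)   # best[j]: minimum cost of splitting blocks[:j]
    prev = [0] * (n + 1)          # prev[j]: start index of the last volume
    best[0] = 0.0
    for j in range(1, n + 1):
        pages = 0
        for i in range(j - 1, -1, -1):
            # Extending the volume back to block i makes boundary i+1 interior.
            if i + 1 < j and (i + 1) in forced:
                break
            pages += sizes[i]
            if pages > max_pages:          # hard size limit
                break
            break_cost = 0.0 if (i == 0 or i in forced) else penalties[i]
            cost = (best[i] + break_cost + volume_cost
                    + (ideal_pages - pages) ** 2)
            if cost < best[j]:
                best[j], prev[j] = cost, i
    if math.isinf(best[n]):
        raise ValueError("no split satisfies the hard constraints")
    volumes, j = [], n
    while j > 0:                            # walk back through the chosen breaks
        i = prev[j]
        volumes.append(sum(sizes[i:j]))
        j = i
    return volumes[::-1]

blocks = [10] * 12                          # 120 pages in 10-page blocks
print(break_volumes(blocks, [0] * 12, max_pages=90))               # [60, 60]
print(break_volumes(blocks, [0] * 12, max_pages=90, forced={8}))   # [80, 40]
```

In this sketch the hard limit is enforced by pruning, forced and disallowed breaks are handled as constraints, and the soft objectives (sensible breaks, similar size, low count) are summed into one cost. How those terms should be weighted against each other is exactly the prioritization question asked in this thread.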

dkager commented 9 years ago

Line breaking (column 2):

Page breaking (column 3):

Volume breaking (column 4):

I'll brainstorm about this a bit with some of the braille people at Dedicon.

stesk commented 9 years ago

Comments on the table as it applies to volume-splitting:

Limit size: I think we have a hard limit of around 90 pages. (I'll get the specific number from someone.) Whether that's a preference or a printer limitation, I don't know. In any case it seems reasonable to have a way of specifying a maximum volume size.

Minimise count: Reduces overhead and minimises the need for the reader to switch volumes.

Similar size: For some reason an even distribution of pages is considered a benefit; for example, 120 total pages should be split 60-60 rather than 80-40. However, the current system tries to get each volume as close as possible to the maximum size and allows the last volume to be significantly smaller. We will have to agree on a preferred way of doing things.

Sensible breaks: Obviously desirable. I will try to find out if there is a hard limit to the variance in volume size due to breaks at headings. A small sample of productions indicates that volumes may contain as few as 68 pages (with 90 as the assumed maximum, so a permitted variance of at least 22 pages).

Forced breaks: Our current system does not support this, but I'm told it would be a great feature to have.

Disallowed breaks: Would this be CSS-configurable as well, or is it a fixed list of elements?
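The two strategies contrasted under "Similar size" above can be sketched as follows. This is an illustration only: both function names are hypothetical, break positions are ignored, and the maximum of 80 pages is chosen purely to reproduce the 80-40 example.

```python
import math

def greedy_split(total_pages, max_pages):
    """Fill each volume to the maximum; the last volume takes the remainder."""
    full, rest = divmod(total_pages, max_pages)
    return [max_pages] * full + ([rest] if rest else [])

def even_split(total_pages, max_pages):
    """Use the same number of volumes, but spread the pages evenly."""
    if total_pages <= 0:
        return []
    count = math.ceil(total_pages / max_pages)
    base, extra = divmod(total_pages, count)
    return [base + 1] * extra + [base] * (count - extra)

print(greedy_split(120, 80))  # [80, 40]
print(even_split(120, 80))    # [60, 60]
```

Both strategies produce the same number of volumes; they differ only in how the slack is distributed.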

mixa72 commented 9 years ago

Comments on the table in accordance with the SBS requirements:

Limit size: Our most frequently used binding is available in several sizes. The biggest one allows a maximum of 70 pages per volume (duplex). The smallest one should have at least 30 pages (minimum). Our ring binders in turn can take up to 90 pages. Books for children are usually delivered in a wire binding with a cover and are limited to 50 pages.

Minimize count: I completely agree with Steffen. Moreover, if the reader only needs small excerpts for the journey (exercises, music scores) he/she can order the book in the form of a ring binder.

Similar size: Volume splitting is done manually at SBS, so the transcribers take care that the break points are at sensible places (before/after sections) = highest priority. Similar size is also an important criterion but after a discussion with some of my colleagues I came to the conclusion that it is done mainly for aesthetic reasons = nice to have. By the way, a significantly smaller (or bigger) last volume can also make sense when it contains exclusively an appendix for instance (footnotes, glossary, long index, etc.).

Sensible breaks: This has a very high priority. If it is not possible to place the break point before/after a section, it would be nice if the system could automatically generate boilerplate text in the book/TOC to let the reader know: "Volume xxx. Continuation of chapter yyy".

Forced breaks: As we do the splitting manually now, this feature is highly desirable. We need to be able to override the system if the automated result is not satisfying.

Disallowed breaks: As we do the splitting manually now, this feature is highly desirable. We need to be able to override the system if the automated result is not satisfying.
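The size limits listed under "Limit size" above could be captured as per-binding configuration. The sketch below is one possible reading of those figures: the dictionary keys and the `fits` helper are hypothetical names, and a 30 to 70 page range for the standard binding is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class BindingLimits:
    max_pages: int
    min_pages: Optional[int] = None  # None: no minimum was stated

# Page figures as reported in the comment above; key names are hypothetical.
SBS_BINDINGS = {
    "standard": BindingLimits(max_pages=70, min_pages=30),
    "ring_binder": BindingLimits(max_pages=90),
    "wire_binding": BindingLimits(max_pages=50),  # children's books
}

def fits(binding: str, pages: int) -> bool:
    """Check whether a volume of the given size respects the binding's limits."""
    limits = SBS_BINDINGS[binding]
    if pages > limits.max_pages:
        return False
    return limits.min_pages is None or pages >= limits.min_pages

print(fits("standard", 68))      # True
print(fits("wire_binding", 68))  # False
```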

bertfrees commented 9 years ago

> Disallowed breaks: Would this be CSS-configurable as well, or is it a fixed list of elements?

Yes, I have an idea for making this configurable with CSS.

bertfrees commented 9 years ago

> If it is not possible to place the break point before/after a section, it would be nice if the system could automatically generate boilerplate text in the book/TOC to let the reader know: "Volume xxx. Continuation of chapter yyy".

Maybe a good use case for generated content (https://github.com/snaekobbi/requirements/pull/33)?

mixa72 commented 9 years ago

Yes, definitely, do you need an example for that?

bertfrees commented 9 years ago

Yes, would be nice, thanks.
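As a rough illustration of the continuation notice discussed above, the sketch below generates the text for every volume that starts in the middle of a chapter. It is neither the example offered by SBS nor the Pipeline's generated-content mechanism. The data model, a list of volumes each holding `(chapter_title, starts_chapter)` blocks, is hypothetical.

```python
def continuation_notices(volumes):
    """Map volume number to a notice for volumes that open mid-chapter.

    volumes: list of volumes, each a list of (chapter_title, starts_chapter)
    """
    notices = {}
    for number, blocks in enumerate(volumes, start=1):
        if not blocks:
            continue
        title, starts_chapter = blocks[0]
        if not starts_chapter:
            notices[number] = f"Volume {number}. Continuation of chapter {title}"
    return notices

book = [
    [("One", True), ("Two", True)],      # volume 1 ends inside chapter "Two"
    [("Two", False), ("Three", True)],   # volume 2 continues chapter "Two"
]
print(continuation_notices(book))
# {2: 'Volume 2. Continuation of chapter Two'}
```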

bertfrees commented 9 years ago

I've updated the third column of the table with your feedback.

bertfrees commented 7 years ago

I'm closing this; I think we can say the requirements have been gathered.