Open tfmorris opened 8 years ago
In addition to signature numbers/marks, catchwords will also need to be identified and removed. Page footers may occur in addition to, or instead of, headers.
Footnotes & endnotes are related, but require separate treatment, so I'll create a separate issue for them.
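For the catchword part, here is a rough sketch of one possible check, not a tested implementation. It relies on the fact that a catchword repeats the first word of the following page. The input shape (a list of pages, each a list of text lines) and the function names are made up for illustration, and it assumes running headers have already been stripped from the top of each page. Abbreviated catchwords (only the first syllable of the next word) are not handled.

```python
import re

def _norm(word):
    """Lowercase a token and drop surrounding punctuation for comparison."""
    return re.sub(r"^\W+|\W+$", "", word).lower()

def strip_catchwords(pages):
    """pages: list of pages, each a list of text lines (hypothetical shape).

    Drops the last non-blank line of a page when it is a single token
    equal to the first word of the next page.
    """
    cleaned = []
    for i, page in enumerate(pages):
        lines = list(page)
        following = pages[i + 1] if i + 1 < len(pages) else []
        # First non-blank line of the next page, or "" if there is none.
        first_line = next((ln for ln in following if ln.strip()), "")
        # Index of the last non-blank line on this page, or None.
        last_idx = next(
            (j for j in range(len(lines) - 1, -1, -1) if lines[j].strip()),
            None,
        )
        if first_line and last_idx is not None:
            tokens = lines[last_idx].split()
            first_word = _norm(first_line.split()[0])
            if len(tokens) == 1 and first_word and _norm(tokens[0]) == first_word:
                # Remove the catchword line and any blank lines after it.
                del lines[last_idx:]
        cleaned.append(lines)
    return cleaned
```

Requiring the candidate to be alone on its line keeps ordinary last lines safe, at the cost of missing catchwords that share a line with a signature mark.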
Older books may have signature numbers on pages which need to be removed or moved out of line. It doesn't appear that ABBYY's layout analysis reliably identifies them as being part of the bottom margin.
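Since the layout analysis can't be relied on here, one fallback might be a text pattern on the last line of each page. The sketch below is only a heuristic under my own assumptions: the regex (a letter repeated up to three times, optionally followed by a small arabic or roman number, e.g. `A`, `B2`, `Aa3`, `B iij`) and the function name are hypothetical and would need tuning against real scans. It returns the mark separately so the caller can either drop it or move it out of line.

```python
import re

# Hypothetical pattern: one letter repeated 1-3 times (A, Aa, AAA),
# then an optional leaf number in arabic or roman numerals.
SIGNATURE_RE = re.compile(
    r"^([A-Za-z])\1{0,2}\s?(\d{1,2}|[ivxj]{1,4})?$", re.IGNORECASE
)

def split_signature_mark(lines):
    """Return (body_lines, mark); mark is None if the last non-blank
    line of the page does not look like a signature mark."""
    for j in range(len(lines) - 1, -1, -1):
        text = lines[j].strip()
        if not text:
            continue  # skip trailing blank lines
        if SIGNATURE_RE.match(text):
            return lines[:j], text
        break  # only the last non-blank line is considered
    return list(lines), None
```

A one-letter catchword such as "I" or "a" would also match this pattern, so the catchword check probably needs to run first.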
ABBYY is pretty good about tagging headers in the one example I looked at (sample size = 1), but 2 or 3 out of 150 did slip through, so we'll probably need to be prepared to look for untagged headers as well.
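One way to catch the headers that slip through might be to compare the first line of each page against the header texts that were tagged. This is a sketch under assumptions: it presumes the tagged header strings and each page's first body line have already been extracted into plain lists (hypothetical inputs, nothing here reads ABBYY output), and the 0.8 similarity threshold is a guess. It uses `difflib.SequenceMatcher` from the standard library so that small OCR differences still match.

```python
import re
from difflib import SequenceMatcher

def _normalize(text):
    """Collapse whitespace, lowercase, and replace digit runs with '#'
    so that changing page numbers do not prevent a match."""
    return re.sub(r"\d+", "#", " ".join(text.split())).lower()

def find_missed_headers(first_lines, known_headers, threshold=0.8):
    """Return indexes of pages whose first line resembles a known header.

    first_lines:   first body line of each page (hypothetical input)
    known_headers: header texts that were already tagged (hypothetical input)
    """
    known = {_normalize(h) for h in known_headers if h.strip()}
    missed = []
    for page_no, line in enumerate(first_lines):
        candidate = _normalize(line)
        if not candidate:
            continue
        if any(
            SequenceMatcher(None, candidate, k).ratio() >= threshold
            for k in known
        ):
            missed.append(page_no)
    return missed
```

For example, with tagged headers `["12 THE HISTORY OF ROME", "CHAPTER II 13"]`, a page whose first line is `"14 THE HISTORY OF ROME"` would be flagged for review.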