ScanTailor-Advanced / scantailor-advanced

ScanTailor Advanced merges the features of the ScanTailor Featured and ScanTailor Enhanced versions and adds new features and fixes.
GNU General Public License v3.0
194 stars, 8 forks

Feature request: detection of page numbers and select content #4

Open AlexJacobs1977 opened 2 years ago

AlexJacobs1977 commented 2 years ago

Please improve the content detection algorithm so that page numbers (e.g. at the bottom left or bottom right) are detected and excluded from Select Content.

FriedrichFroebel commented 2 years ago

Whether page numbers are considered content or not probably is a matter of taste. In my case, I prefer to include them for completeness.

AlexJacobs1977 commented 2 years ago

> Whether page numbers are considered content or not probably is a matter of taste. In my case, I prefer to include them for completeness.

When doing OCR, those page numbers are annoying, requiring manual deletion in an editor.
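
Until something like this exists in the content detection itself, the manual deletion step could be scripted on the OCR output. The following is only a minimal sketch and not part of ScanTailor Advanced. It assumes the OCR text is available one page at a time as a plain string, and that a page number appears as the last non-empty line containing only digits, optionally wrapped in hyphens. The function name and the pattern are hypothetical.

```python
import re

# Assumed pattern: a line holding only a 1-4 digit number,
# optionally wrapped in hyphens, e.g. "12" or "- 12 -".
PAGE_NUMBER = re.compile(r"^\s*-?\s*\d{1,4}\s*-?\s*$")


def strip_trailing_page_number(page_text: str) -> str:
    """Drop the last non-empty line of one page's OCR text
    if it looks like a bare page number."""
    lines = page_text.splitlines()
    # Walk backwards to the last non-empty line and test only that one.
    for i in range(len(lines) - 1, -1, -1):
        if lines[i].strip():
            if PAGE_NUMBER.match(lines[i]):
                del lines[i]
            break
    # Remove any blank lines left at the end of the page.
    return "\n".join(lines).rstrip()


if __name__ == "__main__":
    sample = "Some text.\nMore text.\n\n42\n"
    print(strip_trailing_page_number(sample))
```

A page whose real content ends in a bare number (for example the last cell of a table) would lose that line too, so the result still needs a quick check.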