Description
Add the ability to extract specific areas from a given ground-truth data set (beyond the OCR functionality).
Decide what belongs to the area of interest based on the finest-grained OCR structures. For newspapers, this usually means a specific column or article on a full page.
Use the new pieces API, which also covers the bottom-up structure repairs: lines are adjusted only to fit their enclosed words, and regions are adjusted to fit their lines, which may have changed.
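A minimal sketch of such a selection, assuming that words are the finest structures and that each carries an axis-aligned bounding box. The `Box` type and the functions `center_inside` and `select_words` are hypothetical names for illustration, not part of the project's API. The rule "a word belongs to the area if its center lies inside it" is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical helper)."""
    x0: int
    y0: int
    x1: int
    y1: int


def center_inside(word: Box, area: Box) -> bool:
    """Return True if the center point of the word lies within the area."""
    cx = (word.x0 + word.x1) / 2
    cy = (word.y0 + word.y1) / 2
    return area.x0 <= cx <= area.x1 and area.y0 <= cy <= area.y1


def select_words(words: list[Box], area: Box) -> list[Box]:
    """Keep only the words that belong to the area of interest."""
    return [w for w in words if center_inside(w, area)]


# Example: two words in the left column, one word in the right column.
page_words = [Box(10, 10, 60, 30), Box(70, 10, 120, 30), Box(310, 10, 360, 30)]
left_column = Box(0, 0, 300, 1000)
selected = select_words(page_words, left_column)
```

Testing the center point instead of requiring full containment tolerates words that slightly overlap a column border. A stricter or looser rule may suit other material better.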
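The bottom-up repair could look roughly like the sketch below. It does not use the pieces API. `Line`, `Region`, `enclose` and `repair` are hypothetical stand-ins, and dropping lines that no longer contain any words is an assumption about the desired behavior.

```python
from dataclasses import dataclass, field

# A box is a tuple (x0, y0, x1, y1); hypothetical representation.
Box = tuple[int, int, int, int]


def enclose(boxes: list[Box]) -> Box:
    """Return the smallest box that contains all given boxes."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


@dataclass
class Line:
    box: Box
    words: list[Box] = field(default_factory=list)


@dataclass
class Region:
    box: Box
    lines: list[Line] = field(default_factory=list)


def repair(region: Region) -> Region:
    """Refit lines to their words first, then the region to its lines."""
    # Assumption: lines that lost all their words are dropped.
    kept = [line for line in region.lines if line.words]
    for line in kept:
        line.box = enclose(line.words)
    region.lines = kept
    if kept:
        region.box = enclose([line.box for line in kept])
    return region


# Example: the first line kept two words, the second line lost all of them.
region = Region(
    box=(0, 0, 600, 100),
    lines=[
        Line(box=(0, 0, 600, 30), words=[(10, 5, 60, 25), (70, 5, 120, 25)]),
        Line(box=(0, 40, 600, 70), words=[]),
    ],
)
repaired = repair(region)
```

The order matters: each line must be refitted before the region is, because the region box is computed from the already corrected line boxes.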
Integrate this into the model module.