OCR4all / LAREX

A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.
MIT License

Polygon shrinking #304

Open maxnth opened 2 years ago

maxnth commented 2 years ago

Add various polygon shrinking mechanisms, such as shrinking based on the hull of child elements and shrinking based on connected components.

bertsky commented 2 years ago

BTW, there are already implementations of this in various OCR-D processors. It might help to have a look:

  • Shrinking based on hull of child elements

See ocrd-segment-project's join_polygons, which first finds the shortest path through all polygons efficiently, then uses Shapely to find the nearest points between each pair of neighbouring polygons and connects them with a straight line expanded into a rectangle.
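To illustrate the general idea of shrinking a region to the hull of its children, here is a minimal, dependency-free sketch. It is not the join_polygons implementation: instead of Shapely and nearest-point bridges it uses a plain convex hull (Andrew's monotone chain), which is coarser than a joined, possibly non-convex outline. The names `convex_hull` and `shrink_to_children` are hypothetical and chosen only for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Convex hull via Andrew's monotone chain; collinear points are dropped."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def shrink_to_children(children: Sequence[Sequence[Point]]) -> List[Point]:
    """Replace a parent outline by the convex hull of all child polygon points."""
    all_points = [pt for polygon in children for pt in polygon]
    return convex_hull(all_points)


# Two hypothetical text lines inside a much larger region.
lines = [
    [(10, 10), (60, 10), (60, 20), (10, 20)],
    [(10, 30), (40, 30), (40, 40), (10, 40)],
]
shrunk = shrink_to_children(lines)
```

A convex hull can still cover empty space between children of different widths, which is presumably why a tighter joined outline is preferable in practice.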

  • Shrinking based on connected components

See ocrd-segment-repair's sanitize_region, which uses OpenCV's connectedComponents and findContours.
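As a rough sketch of the connected-component idea, the following dependency-free example shrinks a rectangular region to the foreground it actually contains. It is not the sanitize_region implementation: a pure-Python flood fill stands in for OpenCV's connectedComponents, and a tight bounding box stands in for the contours that findContours would return. The names `connected_components` and `shrink_box`, the `min_size` noise threshold, and the binary list-of-lists image format are all assumptions made for this illustration.

```python
from collections import deque
from typing import List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # x0, y0, x1, y1 (x1 and y1 exclusive)
Pixel = Tuple[int, int]


def connected_components(image: Sequence[Sequence[int]], box: Box) -> List[List[Pixel]]:
    """Label 8-connected foreground (non-zero) pixels inside the box via BFS."""
    x0, y0, x1, y1 = box
    seen = set()
    components: List[List[Pixel]] = []
    for y in range(y0, y1):
        for x in range(x0, x1):
            if not image[y][x] or (x, y) in seen:
                continue
            seen.add((x, y))
            queue = deque([(x, y)])
            pixels: List[Pixel] = []
            while queue:
                cx, cy = queue.popleft()
                pixels.append((cx, cy))
                for dx in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (x0 <= nx < x1 and y0 <= ny < y1
                                and image[ny][nx] and (nx, ny) not in seen):
                            seen.add((nx, ny))
                            queue.append((nx, ny))
            components.append(pixels)
    return components


def shrink_box(image: Sequence[Sequence[int]], box: Box, min_size: int = 2) -> Optional[Box]:
    """Tight box around all components with at least min_size pixels, or None."""
    kept = [p for comp in connected_components(image, box)
            if len(comp) >= min_size for p in comp]
    if not kept:
        return None
    xs = [x for x, _ in kept]
    ys = [y for _, y in kept]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


# Hypothetical binarized page: 1 = ink, 0 = background.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],  # single-pixel speckle
    [0, 0, 0, 0, 0, 0, 0, 0],
]
tight = shrink_box(image, (0, 0, 8, 6))
```

Filtering by component size before shrinking keeps isolated speckles from holding the region open; the right threshold would depend on resolution and binarization quality.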