Fix alpha shape #87

Open bertsky opened 2 years ago

bertsky commented 2 years ago

Some early fixes for the recent #77 – sorry to get back to you so soon @finkf (and thanks for merging so fast).

BTW, if you contemplate making a new release, here is a list of things that have changed since 0.1.5:


* align: fix logging and `--dump-json` #82
* align: avoid superfluous TextEquiv #76
* binarize: traverse regions in reading-order (so derived images are, too)
* ocrolib.morph.spread_labels: fix when no labels exist
* ocrolib.morph.label: :fire: fix ncomps (+1)
* ocrolib.morph.select_regions: fix `dtype` for just 1 label
* binarize/denoise/deskew/dewarp/segment: skip zero-size segments to avoid numpy problems
* common.compute_hlines/separators: fix h/v kernel size
* common.compute_hlines/separators: early length filter must be softer than final criterion
* segment: fix reference before assignment when partitioning
* segment: set correct pageId for image files
* segment (region level): fix and speed up horizontal merging
* segment (page/table level): continue more gracefully when recursive XY-cut fails
* segment (page/table level): fix significance criterion for partitions' line labels
* segment (page level): prevent empty `ReadingOrder` group
* segment (page level): avoid adding existing regions to RO group unless they are immediate children
* resegment: skip empty line polygons
* resegment: prevent overflow in numpy slices due to rounding errors when cropping #79
* resegment: set correct pageId for image files #80
* resegment: use `set_points` to ensure invalidating existing line images
* polygon_for_parent: ensure path validity before checking consistency
* polygon_for_parent: ensure valid polygons for new coords
* segment/polygon_for_parent: skip segment if polygon cannot be made valid


* clip: avoid suppressing overlapping components on both sides
* clip: require independence instead of `min_fraction` threshold
* deskew: delegate to OCR-D/core for reflection and rotation
* dewarp: expose `smoothness` parameter
* segment (region level): ignore separators and other existing regions
* segment: do not suppress neighbours if they cover the segment completely
* segment: do not suppress neighbours if already clipped
* segment (region level): annotate clipped images on region level, too
* segment (region level): improve horizontal merging (transitivity, don't cross separators, enlarge region mask, too)
* segment (page/table level): avoid grouping new text lines with existing regions in a XY-cut
* segment (page/table level): re-order grouped new and existing regions in a XY-cut
* segment (page/table level): avoid creating convex hulls for new regions if these would create additional overlaps with existing regions
* segment (page level): hmerge line labels (within each region) here, too
* segment: upgrade segmentation failures from warning to error
* ocrolib.morph: add `dist_labels` (distance transform of semantic segmentation)
* ocrolib.morph: for CC analysis, use 4-way instead of 8-way connectivity
* ocrolib.morph: new function `rb_reconstruction` based on repeated dilation and masking
* common.compute_images/hlines/separators: use that instead of `spread_labels`
* re/segment: before spreading lines, assign diacritics to seeds below
* resegment: :tada: complete rewrite (now polygonal and global):
  - polygonal calculus instead of pixel/morphology operations (for efficiency)
  - optimise assignments globally instead of locally (to avoid conflicting assignments)
  - add param `level-of-operation` with new level `page`, also as new default
  - suppress all non-text regions and non-text non-regions before text line segmentation
  - on page level, merge horizontally adjacent labels, but avoid creating new region conflicts in doing so
  - general algorithm: 
    * after line segmentation, find contours and polygonalize, then compare overlaps once
    * allow assigning multiple new labels to existing lines and combine them via a (slightly concave) hull polygon
    * assign existing lines to new lines such that among those candidates covering high fg (90%) and bg (60%) shares of the new line, the one with the largest fg and bg share of the existing line wins
    * bail out of resegmentation if the new polygon would lose a share of `threshold` fg or `threshold / 3` bg, or if some new, but unassigned line would be lost entirely
    * subtract matching lines from non-matching lines
* resegment: allow detection of colseps if some regions exist already
* resegment: :tada: compute true alpha shape instead of eroded convex hull
* resegment: :tada: implement alternative `method=ccomps`:
  - calculate connected component analysis
  - calculate distance transform of existing labels
  - find new line seeds by flattening existing labels (via maximum distance)
  - propagate line seeds across connected components (by majority in case of conflict)
  - spread ccomps labels against each other into background
  - for each line,
    * if enough background and foreground will be retained
    * find the hull polygon of the new line via alpha shape
    * annotate as new coordinates
* resegment: :tada: implement alternative `method=baseline`:
  - calculate connected component analysis
  - find new line seeds based on the existing baselines (by applying dilation above)
  - propagate line seeds across connected components (by majority in case of conflict)
  - spread ccomps labels against each other into the background
  - for each line,
    * if enough background and foreground will be retained
    * find the hull polygon of the new line via alpha shape
    * annotate as new coordinates
* segment (page/table level): :tada: improve splitting by separators:
  - when trying to partition slices by separators,
     * also treat pre-existing regions like separators, and
     * fix the condition on smallest allowed partitions (insignificant but complete lines)
  - fall back to (new) topological partitioning
    * when no cut or separator-split partition can be found for the current slice, then attempt to find another separator-split by grouping lines along their mutual horizontal neighbourship with fg separators
  - repeatedly allow both kinds of partitioning, if interspersed
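
For readers unfamiliar with the term in "compute true alpha shape instead of eroded convex hull": the sketch below shows one common way to get an alpha shape of a 2-D point set, via Delaunay triangulation and a circumradius filter. It is only an illustration, not the code in this PR; the function name `alpha_shape_edges`, the parameter `max_radius` and the demo points are assumptions made up for the example.

```python
import numpy as np
from scipy.spatial import Delaunay


def alpha_shape_edges(points, max_radius):
    """Return the boundary edges (pairs of point indices) of an alpha shape.

    Hypothetical helper: keeps every Delaunay triangle whose circumradius
    is below `max_radius`, then reports the edges used by exactly one
    kept triangle (these form the outline).
    """
    points = np.asarray(points, dtype=float)
    edge_count = {}
    for ia, ib, ic in Delaunay(points).simplices:
        a, b, c = points[ia], points[ib], points[ic]
        # triangle area from the 2-D cross product
        area = 0.5 * abs((b[0] - a[0]) * (c[1] - a[1])
                         - (b[1] - a[1]) * (c[0] - a[0]))
        if area == 0.0:
            continue  # degenerate triangle
        # circumradius = product of side lengths / (4 * area)
        radius = (np.linalg.norm(a - b) * np.linalg.norm(b - c)
                  * np.linalg.norm(c - a)) / (4.0 * area)
        if radius >= max_radius:
            continue  # triangle too "empty": leave it out of the shape
        for i, j in ((ia, ib), (ib, ic), (ic, ia)):
            edge = (int(min(i, j)), int(max(i, j)))
            edge_count[edge] = edge_count.get(edge, 0) + 1
    return sorted(edge for edge, n in edge_count.items() if n == 1)


# unit square with its centre, plus one distant point
demo_points = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5), (3, 0.5)]
tight_outline = alpha_shape_edges(demo_points, 1.0)
loose_outline = alpha_shape_edges(demo_points, 2.0)
```

With the smaller radius the distant point is left out and the outline is just the square; with the larger radius the outline approaches the convex hull. That trade-off is what makes an alpha shape fit text lines more tightly than a hull.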

bertsky commented 2 years ago

dammit, 8841abc contains accidentally committed parts that crash

bertsky commented 2 years ago

Here's some illustration of the recent improvements.

Resegmentation using method=baseline

Before and after, with one image per intermediate step:

1. use existing baselines (dilated mask) as seed
2. propagate to connected components by majority rule
3. spread into background with full scale distance
4. propagate to connected components again (now catching more fg, esp. diacritics)
5. spread into background with only half scale
6. polygonize
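
Step 2 (propagating seeds to connected components by majority rule) can be sketched roughly as follows. This is a simplified illustration using `scipy.ndimage`, not the code from this PR; the function name `propagate_by_majority` and the toy arrays are assumptions.

```python
import numpy as np
from scipy import ndimage as ndi


def propagate_by_majority(seeds, foreground):
    """Give each connected component the seed label covering most of it.

    Hypothetical helper. `seeds` is an integer label image (0 = no seed),
    `foreground` a binary image of the same shape.
    """
    # in 2-D, the default structuring element means 4-way connectivity
    components, count = ndi.label(foreground)
    result = np.zeros_like(seeds)
    for comp in range(1, count + 1):
        inside = components == comp
        votes = np.bincount(seeds[inside])
        votes[0] = 0  # pixels without a seed do not vote
        if votes.any():
            result[inside] = votes.argmax()
    return result


foreground = np.array([[1, 1, 1, 0, 1],
                       [0, 0, 0, 0, 1]])
seeds = np.array([[1, 1, 2, 0, 0],
                  [0, 0, 0, 0, 0]])
propagated = propagate_by_majority(seeds, foreground)
```

In the toy input, the left component is touched by labels 1 and 2, and label 1 wins because it covers more pixels. The right component has no seed and stays unlabelled, which is the kind of leftover that the second propagation round (step 4) is meant to catch.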

Resegmentation using method=lineest (also annotating baselines)

Before and after (two examples), with one image per intermediate step:

1. existing line labels with overlaps
2. new line labels
3. new baselines
4. match+assign parts and polygonize

Resegmentation using method=ccomps

Before and after, with one image per intermediate step:

1. use existing segmentation (flattened via maximum of distance transform) as seed
2. propagate to connected components by majority rule
3. spread into background with full scale distance
4. propagate to connected components again (now catching more fg, esp. diacritics)
5. spread into background with only half scale
6. polygonize
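
One plausible reading of step 1 (flattening overlapping line masks via the maximum of their distance transforms) is sketched below. It assumes each existing line comes as its own binary mask and that a pixel goes to the line whose interior it lies deepest in; the helper name `flatten_by_distance` and the toy masks are made up for the example, and the actual implementation may differ.

```python
import numpy as np
from scipy import ndimage as ndi


def flatten_by_distance(masks):
    """Resolve overlaps between binary masks into one label image.

    Hypothetical helper: each pixel gets the (1-based) index of the mask
    in which it is farthest from that mask's background; pixels covered
    by no mask stay 0.
    """
    dist = np.stack([ndi.distance_transform_edt(mask) for mask in masks])
    labels = dist.argmax(axis=0) + 1
    labels[dist.max(axis=0) == 0] = 0
    return labels


# two masks overlapping in columns 4 and 5
line1 = np.array([[0, 1, 1, 1, 1, 1, 0, 0, 0]])
line2 = np.array([[0, 0, 0, 0, 1, 1, 1, 1, 0]])
flat = flatten_by_distance([line1, line2])
```

The overlap is split where the two distance profiles cross, so each line keeps the pixels nearest to its own core, and the result can serve as non-overlapping seeds for the later propagation steps.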

Page segmentation with improved separator detection and partitioning

One image per step:

1. input image
2. binarized (SBB)

non-text detection

3. detect images
4. detect separators: medial axis transform
5. connected component labels of skeleton
6. filter by compactness and distance statistics
7. morphological closing of skeleton
8. link newly connected labels if direction is consistent
9. sort and filter candidates by size
10. propagate from skeleton to full components
11. spread separators into background
12. polygonize and suppress images+separators

whitespace separator detection

13. vertical gradients
14. background
15. combined bg seps
16. combined separator mask

textline detection

17. horizontal gradient
18. filtered lineseeds
19. final ordered line labels
20. line labels spread against separators

textregion detection

21. final rlabels
22. final result

Second example: input image (1) and final, not optimal, result (22).
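
Step 10 (propagating from the skeleton to full components) is the kind of operation that reconstruction by repeated dilation and masking can do. The sketch below shows that general technique with `scipy.ndimage`; it is not the `rb_reconstruction` function from ocrolib.morph, and the name `reconstruct_by_dilation`, the `max_iter` parameter and the toy arrays are assumptions.

```python
import numpy as np
from scipy import ndimage as ndi


def reconstruct_by_dilation(seed, mask, max_iter=1000):
    """Grow `seed` labels inside `mask` until nothing changes.

    Hypothetical helper: each round dilates the seed by one pixel
    (4-way neighbourhood) and clips the result to the mask.
    """
    footprint = ndi.generate_binary_structure(2, 1)
    seed = np.minimum(seed, mask)
    for _ in range(max_iter):
        grown = np.minimum(ndi.maximum_filter(seed, footprint=footprint), mask)
        if np.array_equal(grown, seed):
            break  # converged
        seed = grown
    return seed


mask = np.array([[1, 1, 0, 1],
                 [1, 0, 0, 1],
                 [0, 0, 0, 1]])
seed = np.zeros_like(mask)
seed[0, 0] = 1  # a single "skeleton" pixel in the left component
full = reconstruct_by_dilation(seed, mask)
```

Only the component containing the seed is recovered; the right-hand component has no seed and is dropped, which is how filtered skeleton candidates can select the full separator components they belong to.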
