No. ARU-Net is a horrible mess of a system full of postprocessing algorithms for questionable gains in a metric that is completely independent of recognition accuracy. The basic underlying network is a U-Net, which is just another type of FCN; the difference between the residual blocks implemented here and their DenseNet-like construction is probably negligible.
As the readme of the baseline branch says, I'm trying to use a relatively simple convolutional network to expand baseline masks to complete lines without extensive postprocessing. Baselines are easily separable into objects, while fine-grained pixel labellings (as produced by the main branch net) are not. Using baselines as seeds for expansion allows object detection on closely located and intermingled objects, something most other object detection networks fail at. Unfortunately, while the neural expansion part works just fine in my tests, all the baseline detector architectures I've tried converge on empty output images, even when using weighted gradients.
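For readers unfamiliar with the "empty output" failure mode, here is a minimal sketch of why it is attractive to a network when baseline pixels are rare, and what a class-weighted loss changes. This is not the code from the baseline branch: the function `weighted_bce`, the toy 100-pixel mask, and the inverse-frequency `pos_weight` are all assumptions made for illustration, and the thread does not say which weighting scheme was actually used.

```python
import math

def weighted_bce(pred, target, pos_weight):
    """Mean binary cross-entropy; positive (baseline) pixels are scaled by pos_weight."""
    eps = 1e-7
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)

# Hypothetical flattened mask: 2 baseline pixels out of 100.
target = [1.0] * 2 + [0.0] * 98

# Two candidate outputs: "nothing anywhere" vs. a mediocre but correct detection.
empty = [0.01] * 100
correct = [0.9] * 2 + [0.1] * 98

# Assumed weighting: ratio of background to baseline pixels.
pos_weight = target.count(0.0) / target.count(1.0)

unweighted_empty = weighted_bce(empty, target, 1.0)
unweighted_correct = weighted_bce(correct, target, 1.0)
weighted_empty = weighted_bce(empty, target, pos_weight)
weighted_correct = weighted_bce(correct, target, pos_weight)

print(f"unweighted: empty={unweighted_empty:.3f} correct={unweighted_correct:.3f}")
print(f"weighted:   empty={weighted_empty:.3f} correct={weighted_correct:.3f}")
```

On this toy mask the unweighted loss slightly prefers the empty prediction, while the weighted loss strongly prefers the correct one. As the comment above notes, weighting alone did not prevent the collapse in practice, so treat this only as an illustration of the imbalance, not as a fix.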
@mittagessen Thank you for your hard work, you are appreciated.
Currently, ARU-Net is the most accurate & advanced layout analysis method; it even supersedes DMRZ. This is a request that you consider implementing ARU-Net.
https://github.com/TobiasGruening/ARU-Net
https://arxiv.org/abs/1802.03345

The image below shows the results of the ICDAR 2017 Competition on Baseline Detection:

The ICDAR 2017 cBAD training and evaluation dataset used (simple & complex):
https://scriptnet.iit.demokritos.gr/competitions/5/1/
https://zenodo.org/record/835441
https://arxiv.org/abs/1705.03311