The ground truth data was produced at UB Tübingen with Transkribus. The Transkribus PAGE export typically has rather bad line boxes, but the baselines are usually better, so I wanted to try a second training with --repolygonize to see whether that works better. The log output shows a lot of problems with the data from the PAGE files and includes line numbers, but not the filenames. Maybe that can be improved, too.
Hm, sorry about that. A better way to repolygonize is to use the repolygonize.py script in contrib/, as that writes a new XML file. In any case, the Transkribus exports are often a bit problematic: their baselines float below the actual baseline, so repolygonizing can sometimes produce completely flat polygons. There's already an automatic offset applied, but at times it isn't enough. So it's best to verify the result of the repolygonization with a viewer or the line extraction script (contrib/extract_lines.py).
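For reference, a rough sketch of that check; the exact invocations are assumptions on my part, so consult each script's --help for the real options:

$ python contrib/repolygonize.py *.xml   # writes new XML files with recomputed polygons
$ python contrib/extract_lines.py *.xml  # dumps line images for visual inspection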
Apart from that, I'm currently rewriting the training code and have introduced a binary dataset format which vastly accelerates training (100% GPU utilization without loader processes). It isn't finished yet, but the basics are working, so if you want, check out the feature/binary_dataset branch and run:
$ ketos compile -f xml --workers 8 -o gt-fraktur **/*.xml
and then run ketos train as usual, just with the -f binary flag and the binary file(s) instead. You can skip the loader thread argument; it usually slows things down.
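For example, assuming the compiled dataset from the compile step above ended up in a file named gt-fraktur:

$ ketos train -f binary gt-fraktur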
Curiously, I've noticed that I get far fewer segmentation errors (TopologyException and "Polygonizer failed on line X") when I perform inference on the GPU.
Would it be possible to pad the images in order to run inference on the GPU in batches? It would be a lot faster...
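This is not kraken API, just a minimal PyTorch sketch of the padding idea, assuming fixed-height line tensors of shape (C, H, W) (line images are usually normalized to a common height before recognition); pad_batch is a hypothetical helper:

import torch
import torch.nn.functional as F

def pad_batch(lines):
    """Right-pad (C, H, W) line tensors to the widest line and stack into one batch."""
    max_w = max(t.shape[-1] for t in lines)
    # F.pad pads the last dimension by (left, right) amounts
    padded = [F.pad(t, (0, max_w - t.shape[-1])) for t in lines]
    return torch.stack(padded)  # (N, C, H, max_w), ready for a batched forward pass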
The --repolygonize option has now been removed in main, so the issue has become obsolete.
Training similar to a previous run, but with the additional option --repolygonize, fails. The training process aborts after spending much time on polygonizing.