bertsky opened this issue 2 years ago
The first attempt in predict-async does not actually reduce wall time (it only reduces CPU seconds slightly). Perhaps we must first disentangle the page loop (i.e. turn it into a pipeline).
However, https://github.com/bertsky/ocrd_detectron2/commit/88617a25d3f847d65e8260391b27fda45ae55987 (i.e. predicting and post-processing at lower pixel density – no more than 150 DPI) does help quite a bit already.
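The idea behind that commit is to cap the working pixel density before prediction and map the resulting coordinates back afterwards. A minimal sketch of that approach (the helper names here are illustrative, not the commit's actual code):

```python
# Sketch: cap the effective pixel density at 150 DPI before inference,
# then rescale the predicted polygons back to the original resolution.
# (Assumption: the real commit integrates this into the OCR-D workspace
# image handling; this only shows the arithmetic.)
MAX_DPI = 150

def zoom_factor(image_dpi: float) -> float:
    """Downscale factor so the effective density does not exceed MAX_DPI."""
    return min(1.0, MAX_DPI / image_dpi)

def upscale_polygon(polygon, zoom: float):
    """Map prediction coordinates back to original-resolution coordinates."""
    return [(x / zoom, y / zoom) for x, y in polygon]
```

Prediction and post-processing then operate on the smaller image, and only the resulting region polygons get rescaled, so the accuracy loss is limited to coordinate rounding.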
We currently only use Detectron2's `DefaultPredictor` for inference: https://github.com/bertsky/ocrd_detectron2/blob/0272d95a930d5136bba29e530a3530c13ab17166/ocrd_detectron2/segment.py#L126 But the documentation says:
One can clearly see that GPU utilization is low, so a multi-threaded implementation with data pipelining would boost performance a lot.
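As a rough illustration of what such a pipeline could look like (the `preprocess` and `predict` callables below are stand-ins, not the actual ocrd_detectron2 code): one thread prepares page images on the CPU while the main thread consumes them for inference, so the GPU stage does not sit idle during image decoding.

```python
import threading
import queue

def run_pipeline(pages, preprocess, predict, maxsize=2):
    """Overlap CPU-side preprocessing with (GPU-side) prediction.

    pages:      iterable of page identifiers
    preprocess: CPU-bound function, page -> model input
    predict:    function, model input -> result (e.g. GPU inference)
    """
    q = queue.Queue(maxsize=maxsize)  # bounded, to limit memory use
    SENTINEL = object()               # marks end of the page stream

    def producer():
        # Prepare pages ahead of the consumer, blocking when the queue is full.
        for page in pages:
            q.put((page, preprocess(page)))
        q.put(SENTINEL)

    threading.Thread(target=producer, daemon=True).start()
    results = {}
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        page, data = item
        results[page] = predict(data)
    return results
```

The bounded queue keeps at most a couple of pages in flight, which overlaps the two stages without unbounded memory growth; whether this actually helps here depends on how much of the per-page time is CPU-bound preprocessing versus the prediction itself.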