QUIVER dashboard for OCR-D

Very General Description

The kwalitee dashboard is designed to serve the use cases of several user groups. In general, it is intended to provide an overview of the essentials of the numerous and rather complex repositories of the OCR-D project.

Technical Approach

The kwalitee dashboard reads JSON files describing the individual processors. This JSON data must be generated automatically in advance. There are therefore two tasks: displaying the data in the dashboard, and running the measurements on the processors or repositories. The measurements are smoke tests, validation of the ocrd-tool.json (which contains the processor metadata), and a check of the lock files (e.g. to read versions). Measurements run as CI jobs, triggered automatically by a cron job. For benchmarking, the Nextflow report has to be parsed (it probably comes as HTML, not as structured data). To benchmark whole workflows, the values for 3 works, 3 models and 3 workflows have to be compared.
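The display side of this split can be sketched as follows. This is a minimal illustration, not the actual implementation: it assumes the CI jobs write one JSON file per processor into a single directory, and the function name `load_processor_reports` is hypothetical.

```python
import json
from pathlib import Path


def load_processor_reports(report_dir):
    """Read every *.json file in report_dir and return the parsed reports.

    Files that are not valid JSON are skipped and their names returned
    separately, so one broken CI run does not hide all other processors.
    The one-file-per-processor layout is an assumption of this sketch.
    """
    reports, broken = [], []
    for path in sorted(Path(report_dir).glob("*.json")):
        try:
            with path.open(encoding="utf-8") as handle:
                reports.append(json.load(handle))
        except json.JSONDecodeError:
            broken.append(path.name)
    return reports, broken
```

Returning the names of unreadable files alongside the parsed data lets the dashboard show which measurements are missing instead of failing silently.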

The implementation builds on work by Konstantin: http://kba.cloud/ocrd-kwalitee/

User Groups and Stories

User Group 1: OCR-D developers -> PROJECTS TAB

The first user group is the OCR-D developers, who are responsible for maintaining the OCR-D software, updating it during ongoing development, and providing it in the long term.

The so-called projects tab was conceived for this user group. Since the coordination project has to define the actual product and ensure its quality in the third phase, this component of the dashboard is implemented first.

User Stories

As an OCR-D developer, I want to have an overview of the repositories and their development status.

[ ] Feature 1: Information: link to README & changelog

As an OCR-D developer, I want to see directly which projects have been updated since the last ocrd-all release, so that I can update the missing ones.

Feature 2: Filter by different criteria

[ ] Filter before and after the ocrd-all release
[ ] Filter by "release with version number" and "just commits to master"
[ ] Projects based on bashlib
[ ] Projects based on Python
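The filter criteria above could be applied to the per-project JSON data roughly as follows. This is a sketch under assumptions: the field names `base`, `last_commit` and `latest_version`, the sample projects, and the release date are all hypothetical and not taken from an existing schema.

```python
from datetime import date


def filter_projects(projects, *, updated_after=None, base=None, tagged_release=None):
    """Return the projects matching every criterion that is not None.

    updated_after:  keep projects whose last commit is later than this date
    base:           keep projects built on this base, e.g. "python" or "bashlib"
    tagged_release: True keeps projects with a version number,
                    False keeps projects with only commits to master
    """
    selected = []
    for project in projects:
        last_commit = date.fromisoformat(project["last_commit"])
        if updated_after is not None and last_commit <= updated_after:
            continue
        if base is not None and project["base"] != base:
            continue
        if tagged_release is not None and bool(project["latest_version"]) != tagged_release:
            continue
        selected.append(project)
    return selected


# Hypothetical sample data for illustration only.
projects = [
    {"name": "project-a", "base": "python", "last_commit": "2022-03-01", "latest_version": "v1.2.0"},
    {"name": "project-b", "base": "bashlib", "last_commit": "2021-11-15", "latest_version": None},
]
last_release = date(2022, 1, 1)  # assumed date of the last ocrd-all release
updated = filter_projects(projects, updated_after=last_release)
```

Keeping every criterion optional means the same function serves each checkbox on its own as well as any combination of them.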

As an OCR-D developer, I want to be able to see directly which projects were part of the last release, so that I can release those that were not.

[ ] Feature 3: requirements/ dependencies feature

This will be realized as a conflict list (step 1: JSON with the information; step 2: the information is also shown in the dashboard).

_Example of a common error that prevents a release (for better understanding): different projects have conflicting dependencies, and this only becomes apparent at runtime. Visualizing the differing versions in each requirements.txt alongside the version actually installed in the ocrd-all venv makes such problems identifiable and fixable. Example: project A needs tensorflow 1.0.*, project B needs tensorflow 2, and tensorflow 2 is installed. Project A fails at runtime; I would like to identify this error beforehand._
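The tensorflow example can be turned into a small conflict check. This sketch makes strong simplifying assumptions: requirements are reduced to a version prefix such as `1.0.*` or `2`, the input dictionaries are assumed to have been extracted beforehand from each requirements.txt and from the venv, and the installed version `2.4.1` is invented for illustration. Real requirement specifiers are richer than prefixes and would need a proper parser.

```python
import json


def matches_prefix(installed, required):
    """True if the installed version starts with the required components.

    "2.4.1" matches "2" and "2.4.*", but not "1.0.*".
    """
    if required.endswith(".*"):
        required = required[:-2]
    required_parts = required.split(".")
    return installed.split(".")[:len(required_parts)] == required_parts


def find_conflicts(requirements, installed):
    """List every project requirement that the installed versions do not satisfy."""
    conflicts = []
    for project, packages in sorted(requirements.items()):
        for package, wanted in sorted(packages.items()):
            actual = installed.get(package)  # None if the package is not installed
            if actual is None or not matches_prefix(actual, wanted):
                conflicts.append({
                    "project": project,
                    "package": package,
                    "required": wanted,
                    "installed": actual,
                })
    return conflicts


# Hypothetical input mirroring the example in the text.
requirements = {
    "project-a": {"tensorflow": "1.0.*"},
    "project-b": {"tensorflow": "2"},
}
installed = {"tensorflow": "2.4.1"}

conflicts = find_conflicts(requirements, installed)
conflict_json = json.dumps(conflicts, indent=2)  # step 1: JSON with the information
```

Here only project A ends up in the conflict list, because the installed tensorflow 2 release satisfies project B but not the 1.0.* requirement.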

User Group 2: OCR-D users -> WORKFLOW TAB

As an OCR-D user, I would like to quickly check which workflow is most suitable for my data (criteria: publication date, quality, font, layout, pages) so that I can select the appropriate workflow (then download it and run it on my end). Cf. https://github.com/OCR-D/zenhub/issues/42

User Group 3: developers who want to use or improve OCR-D -> PROCESSORS TAB

Pads as living documents for further refinement

https://pad.gwdg.de/jouzUjbwR5KvVzomazFy6w

https://pad.gwdg.de/y41n80-BR3yq2LtUsBl-8w
