Open PeterJunger opened 1 year ago
Structuring is currently done in the metadata editor. This feature request proposes an additional way of structuring, which is done with physical divider sheets by the user and automatically translated into structure elements by Kitodo.
How does the automatic structuring work?
The user places physical divider sheets between the sheets of the workpiece to be scanned. These divider sheets represent the individual logical structures of the workpiece. Each divider sheet contains human- and machine-readable info to identify the type of logical structure that is represented by this sheet.
The workpiece is scanned together with the divider sheets. An automatic task in the workflow evaluates the scans. The machine-readable part on the divider sheets helps Kitodo to identify them and decide how to handle them. Kitodo can create a new logical structure from the information found on the divider sheet and assign all following pages to this new logical structure. The scans of the divider sheets are automatically removed after the structuring.
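The evaluation step described above can be sketched as follows. This is only an illustration of the idea, not Kitodo code: the `Scan` and `LogicalStructure` classes, the `structure_scans` function and the fallback type `"unstructured"` are hypothetical names, and the sketch assumes that the machine-readable part of each scan has already been decoded into a structure type (or `None` for an ordinary page).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Scan:
    filename: str
    # Hypothetical: structure type decoded from the machine-readable part,
    # or None for an ordinary page of the workpiece.
    divider_type: Optional[str] = None


@dataclass
class LogicalStructure:
    type: str
    pages: List[str] = field(default_factory=list)


def structure_scans(scans, default_type="unstructured"):
    """Group scans into logical structures and drop the divider scans."""
    structures = []
    current = None
    for scan in scans:
        if scan.divider_type is not None:
            # A divider sheet starts a new structure; its scan is not kept.
            current = LogicalStructure(scan.divider_type)
            structures.append(current)
            continue
        if current is None:
            # Pages before the first divider sheet go into a fallback element.
            current = LogicalStructure(default_type)
            structures.append(current)
        current.pages.append(scan.filename)
    return structures
```

Each divider scan opens a new element and is itself discarded, so only the pages of the workpiece remain assigned to structures. How pages before the first divider sheet should be handled is an open design question; the fallback element here is just one possible choice.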
Divider sheets
The divider sheets contain generic information about the logical structure they represent. (This information is displayed in human- and machine-readable form, e.g. as normal text and a QR code.) For example, there might be one divider sheet for chapters, another for the table of contents and a third one for the book cover. These divider sheets are not process- or workpiece-specific. They can be used in different processes, and multiple divider sheets of the same type can be used in the same process (representing chapters, for example). However, the divider sheets are ruleset-specific, because they represent logical structures of a specific ruleset.
A new tab should be added to the "edit ruleset" page where the divider sheets for the ruleset can be configured. The configured divider sheets could be printed from a new button located in the templates list.
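To make the "generic but ruleset-specific" idea concrete, here is a minimal sketch of what the machine-readable text on a divider sheet could look like. The payload format, the `kitodo-divider` prefix and both function names are assumptions made for this example; no such format exists in Kitodo. Rendering the text as a QR code and reading it back from a scan are left out.

```python
PREFIX = "kitodo-divider"  # hypothetical marker, not an existing Kitodo format


def encode_payload(ruleset: str, structure_type: str) -> str:
    """Build the text that would be rendered as a QR code on the sheet."""
    return f"{PREFIX}|{ruleset}|{structure_type}"


def decode_payload(text: str, ruleset: str, allowed_types: set):
    """Return the structure type, or None if the text is not a valid
    divider sheet for this ruleset."""
    parts = text.split("|")
    if len(parts) != 3 or parts[0] != PREFIX:
        return None  # an ordinary page that happens to contain a code
    _, sheet_ruleset, structure_type = parts
    if sheet_ruleset != ruleset or structure_type not in allowed_types:
        return None  # sheet belongs to another ruleset or an unknown type
    return structure_type
```

Because the payload carries only the ruleset and the structure type, the same printed sheet can be reused in any process that uses that ruleset, while a sheet printed for a different ruleset is rejected rather than silently misinterpreted.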
Example
A physical workpiece might have the following structure after placing the divider sheets:
The automatic structuring would create the following structure:
(Automatic pagination is not part of this feature. The page numbers are shown in this example for better readability.)
I estimate the costs for this development as high.
I welcome this proposal to add an (in part) already developed solution to the core features. Especially, I second the need for an “additional way of structuring” or more bluntly for a way of not having to use the graphical metadata editor. (Not that there is something bad about it in any way, but opening and using it is simply too cumbersome and time-consuming if it needs to be done for each item.)
As suggested, a generalization from the already existing implementation is needed and I think this should not only apply to the codebase but to the conceptualization in general.
If I understand correctly, the proposition as laid out above consists of mainly two new components:
From my point of view, step 2b (automatic assignment of structural information) seems very important for a wide range of use cases. On the one hand, its inclusion alone may justify supporting this entire proposal; on the other hand, it should not be tied to a single use case such as divider sheets.
What other use cases?
This is the thing which preoccupies me the most. Of course it is possible to insert image files of divider sheets into already existing collections. But if you have several million pre-existing images, you will have to automate that, and if you want to get anything useful out of it, different workflows will need to be applied depending, e.g., on the document type. In that case, you could make good use of a workflow engine such as Kitodo. But a workflow consisting of steps 2a and 2b described above plus an additional step 0, which is "add, depending on information xy, the divider sheets that will be removed again two steps later", is, I hope we can all agree, not desirable. It would be far better if Kitodo could do the structuring depending on the given data itself. Thus:
There are certain document types that will simply be structured exactly the same way every time and should always be scanned/saved in the order of this structure. I can also trust the scanning personnel to do this without having them put in divider sheets (I could also say that I can trust them with this as much as, or more than, I can trust them with inserting the sheets correctly), and adding the sheets just costs time and is another repetitive and exhausting task. So let's just skip it and do the structuring dependent on the doctype given in the imported metadata. In existing collections (see above), structural information is also often represented in filenames, and these could therefore easily be converted into actual metadata containing structural information. Additionally, there are other file attributes (image size, for example) that could easily be used to detect cover pages, empty pages, inserted tabular sheets or maps, if there is a need for that.
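As a rough illustration of the filename idea, the following sketch derives structure groups from filenames alone. The naming scheme (`<sequence>_<label>.<extension>` with zero-padded sequence numbers), the `PATTERN` constant and the function name are purely hypothetical; real collections will use other conventions, and the pattern would have to be configurable.

```python
import re
from itertools import groupby

# Hypothetical naming scheme: "<sequence>_<structure label>.<extension>"
PATTERN = re.compile(r"^(\d+)_([a-z-]+)\.\w+$")


def structure_from_filenames(filenames):
    """Return (label, [filenames]) groups for consecutive equal labels."""
    parsed = []
    # Sorting by name gives scan order only if sequence numbers are zero-padded.
    for name in sorted(filenames):
        match = PATTERN.match(name)
        label = match.group(2) if match else "unknown"
        parsed.append((label, name))
    return [
        (label, [name for _, name in group])
        for label, group in groupby(parsed, key=lambda item: item[0])
    ]
```

Files that do not match the pattern end up in an `"unknown"` group instead of raising an error, so that a single odd filename does not stop the whole import.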
3. Automatic structuring by OCR/HTR.
This is where things are getting really interesting. Of course it is understandable that this won't be part of a first implementation, but the potential is huge. It should at least be considered as something to keep in mind for the future, unless there is a clear decision to keep this out of scope for good.
What should be changed about the proposal then?
The archives (Landesarchiv Hessen, Kreisarchiv Esslingen and Landesarchiv Schleswig-Holstein) wish to take over the function "evaluate docket" in order to enable the pre-distortion of the archival records.
As my comment may already suggest, this has not been decided on and may, at least in our case, not be the exact feature we need.
3. Automatic structuring by OCR/HTR.
This is where things are getting really interesting. Of course it is understandable that this won't be part of a first implementation, but the potential is huge. It should at least be considered as something to keep in mind for the future, unless there is a clear decision to keep this out of scope for good.
A proposal towards this direction is actually being prepared.
Description
The Swiss federal archive proposes to take over the function evaluate docket in order to enable the pre-distortion of the archival records.
Related Issues
Enable the pre-distortion using bar code pages (order, envelope, document, documents, dossier, sub-dossier)
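Unlike flat divider sheets, bar code pages for archival records imply nesting. The sketch below shows one way such a sequence could be turned into a tree. The nesting order in `LEVELS` (dossier above sub-dossier above document) is an assumption made for this example and covers only three of the listed types; the request does not specify the actual hierarchy, and `build_tree` is a hypothetical name.

```python
# Assumed nesting depth per bar code type; the real hierarchy is not
# specified in the request.
LEVELS = {"dossier": 0, "sub-dossier": 1, "document": 2}


def build_tree(items):
    """items: ("barcode", type) markers and ("page", filename) entries
    in scan order. Returns a nested dict rooted at a synthetic root node."""
    root = {"type": "root", "children": [], "pages": []}
    stack = [(-1, root)]  # (level, node) path from the root to the open node
    for kind, value in items:
        if kind == "barcode":
            level = LEVELS[value]
            # Close every open node at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            node = {"type": value, "children": [], "pages": []}
            stack[-1][1]["children"].append(node)
            stack.append((level, node))
        else:
            # Ordinary pages belong to the innermost open node.
            stack[-1][1]["pages"].append(value)
    return root
```

A bar code page of a higher level closes all open elements below it, so a new sub-dossier ends the preceding document without needing an explicit "end" sheet.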
Expected Benefits of this Development
Archives need pre-distortion in order to map the structure of the archival records, as the structure has to be broken up for the digitisation process and is therefore no longer visible afterwards.
Estimated Costs and Complexity
The complexity of the development is medium.